Comparing d4de2bb368..ef91101ee2 - git - Gitea: Git with a cup of tea

Author	SHA1	Message	Date
Junio C Hamano	bf949ade81	Git 2.32-rc0 Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-16 21:05:24 +09:00
Junio C Hamano	e004fd6b69	Merge branch 'ls/typofix' * ls/typofix: pretty: fix a typo in the documentation for %(trailers)	2021-05-16 21:05:24 +09:00
Junio C Hamano	a8a2491e62	Merge branch 'dl/stash-show-untracked-fixup' The code to handle options recently added to "git stash show" around untracked part of the stash segfaulted when these options were used on a stash entry that does not record untracked part. * dl/stash-show-untracked-fixup: stash show: fix segfault with --{include,only}-untracked t3905: correct test title	2021-05-16 21:05:24 +09:00
Junio C Hamano	16f91451fa	Merge branch 'wc/packed-ref-removal-cleanup' When "git update-ref -d" removes a ref that is packed, it left empty directories under $GIT_DIR/refs/ for * wc/packed-ref-removal-cleanup: refs: cleanup directories when deleting packed ref	2021-05-16 21:05:24 +09:00
Junio C Hamano	94294e92e1	Merge branch 'lh/maintenance-leakfix' * lh/maintenance-leakfix: maintenance: fix two memory leaks	2021-05-16 21:05:24 +09:00
Junio C Hamano	caf6840be0	Merge branch 'ma/typofixes' A couple of trivial typofixes. * ma/typofixes: pretty-formats.txt: add missing space git-repack.txt: remove spurious ")"	2021-05-16 21:05:24 +09:00
Junio C Hamano	c7c7c460f8	Merge branch 'ah/merge-ort-i18n' An i18n fix. * ah/merge-ort-i18n: merge-ort: split "distinct types" message into two translatable messages	2021-05-16 21:05:23 +09:00
Junio C Hamano	483932a3d8	Merge branch 'dd/mailinfo-quoted-cr' "git mailinfo" (hence "git am") learned the "--quoted-cr" option to control how lines ending with CRLF wrapped in base64 or qp are handled. * dd/mailinfo-quoted-cr: am: learn to process quoted lines that ends with CRLF mailinfo: allow stripping quoted CR without warning mailinfo: allow squelching quoted CRLF warning mailinfo: warn if CRLF found in decoded base64/QP email mailinfo: stop parsing options manually mailinfo: load default metainfo_charset lazily	2021-05-16 21:05:23 +09:00
Junio C Hamano	c8e34a7ac2	Merge branch 'ab/sparse-index-cleanup' Code clean-up. * ab/sparse-index-cleanup: sparse-index.c: remove set_index_sparse_config()	2021-05-16 21:05:23 +09:00
Junio C Hamano	502a67891c	Merge branch 'ab/streaming-simplify' Code clean-up. * ab/streaming-simplify: streaming.c: move {open,close,read} from vtable to "struct git_istream" streaming.c: stop passing around "object_info *" to open() streaming.c: remove {open,close,read}_method_decl() macros streaming.c: remove enum/function/vtbl indirection streaming.c: avoid forward declarations	2021-05-16 21:05:23 +09:00
Junio C Hamano	a737e1f1d2	Merge branch 'mt/parallel-checkout-part-3' The final part of "parallel checkout". * mt/parallel-checkout-part-3: ci: run test round with parallel-checkout enabled parallel-checkout: add tests related to .gitattributes t0028: extract encoding helpers to lib-encoding.sh parallel-checkout: add tests related to path collisions parallel-checkout: add tests for basic operations checkout-index: add parallel checkout support builtin/checkout.c: complete parallel checkout support make_transient_cache_entry(): optionally alloc from mem_pool	2021-05-16 21:05:23 +09:00
Junio C Hamano	644f4a2046	Merge branch 'jt/push-negotiation' "git push" learns to discover common ancestor with the receiving end over protocol v2. * jt/push-negotiation: send-pack: support push negotiation fetch: teach independent negotiation (no packfile) fetch-pack: refactor command and capability write fetch-pack: refactor add_haves() fetch-pack: refactor process_acks()	2021-05-16 21:05:22 +09:00
Junio C Hamano	97eea85a0a	The seventeenth batch Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-14 08:26:11 +09:00
Junio C Hamano	52371bf449	Merge branch 'mt/clean-clean' Code clean-up. * mt/clean-clean: clean: remove unnecessary variable	2021-05-14 08:26:11 +09:00
Junio C Hamano	47fa106617	Merge branch 'ow/no-dryrun-in-add-i' "git add -i --dry-run" does not dry-run, which was surprising. The combination of options has taught to error out. * ow/no-dryrun-in-add-i: add: die if both --dry-run and --interactive are given	2021-05-14 08:26:09 +09:00
Junio C Hamano	e289f681ed	Merge branch 'jk/p4-locate-branch-point-optim' "git p4" learned to find branch points more efficiently. * jk/p4-locate-branch-point-optim: git-p4: speed up search for branch parent git-p4: ensure complex branches are cloned correctly	2021-05-14 08:26:08 +09:00
Junio C Hamano	eede71149e	Merge branch 'ba/object-info' Over-the-wire protocol learns a new request type to ask for object sizes given a list of object names. * ba/object-info: object-info: support for retrieving object info	2021-05-14 08:26:08 +09:00
Junio C Hamano	daffa8961b	Merge branch 'pw/patience-diff-clean-up' Code clean-up. * pw/patience-diff-clean-up: patience diff: remove unused variable patience diff: remove unnecessary string comparisons	2021-05-14 08:26:08 +09:00
Junio C Hamano	65c18913de	Merge branch 'pw/word-diff-zero-width-matches' The word-diff mode has been taught to work better with a word regexp that can match an empty string. * pw/word-diff-zero-width-matches: word diff: handle zero length matches	2021-05-14 08:26:06 +09:00
Denton Liu	1ff595d218	stash show: fix segfault with --{include,only}-untracked When `git stash show --include-untracked` or `git stash show --only-untracked` is run on a stash that doesn't include an untracked entry, a segfault occurs. This happens because we do not check whether the untracked entry is actually present and just attempt to blindly dereference it. Ensure that the untracked entry is present before actually attempting to dereference it. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-13 08:48:59 +09:00
Denton Liu	aa2b05d9f6	t3905: correct test title We reference the non-existent option `git stash show --show-untracked` when we really meant `--only-untracked`. Correct the test title accordingly. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-13 08:48:16 +09:00
Louis Sautier	e6f68f62e0	pretty: fix a typo in the documentation for %(trailers) Signed-off-by: Louis Sautier <sautier.louis@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-13 07:47:51 +09:00
Lénaïc Huard	c5d0b12a4c	maintenance: fix two memory leaks Fixes two memory leaks when running `git maintenance start` or `git maintenance stop` in `update_background_schedule`: $ valgrind --leak-check=full ~/git/bin/git maintenance start ==76584== Memcheck, a memory error detector ==76584== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al. ==76584== Using Valgrind-3.16.1 and LibVEX; rerun with -h for copyright info ==76584== Command: /home/lenaic/git/bin/git maintenance start ==76584== ==76584== ==76584== HEAP SUMMARY: ==76584== in use at exit: 34,880 bytes in 252 blocks ==76584== total heap usage: 820 allocs, 568 frees, 146,414 bytes allocated ==76584== ==76584== 65 bytes in 1 blocks are definitely lost in loss record 17 of 39 ==76584== at 0x483E6AF: malloc (vg_replace_malloc.c:306) ==76584== by 0x3DC39C: xrealloc (wrapper.c:126) ==76584== by 0x3992CC: strbuf_grow (strbuf.c:98) ==76584== by 0x39A473: strbuf_vaddf (strbuf.c:392) ==76584== by 0x39BC54: xstrvfmt (strbuf.c:979) ==76584== by 0x39BD2C: xstrfmt (strbuf.c:989) ==76584== by 0x18451B: update_background_schedule (gc.c:1977) ==76584== by 0x1846F6: maintenance_start (gc.c:2011) ==76584== by 0x1847B4: cmd_maintenance (gc.c:2030) ==76584== by 0x127A2E: run_builtin (git.c:453) ==76584== by 0x127E81: handle_builtin (git.c:704) ==76584== by 0x128142: run_argv (git.c:771) ==76584== ==76584== 240 bytes in 1 blocks are definitely lost in loss record 29 of 39 ==76584== at 0x4840D7B: realloc (vg_replace_malloc.c:834) ==76584== by 0x491CE5D: getdelim (in /usr/lib/libc-2.33.so) ==76584== by 0x39ADD7: strbuf_getwholeline (strbuf.c:635) ==76584== by 0x39AF31: strbuf_getdelim (strbuf.c:706) ==76584== by 0x39B064: strbuf_getline_lf (strbuf.c:727) ==76584== by 0x184273: crontab_update_schedule (gc.c:1919) ==76584== by 0x184678: update_background_schedule (gc.c:1997) ==76584== by 0x1846F6: maintenance_start (gc.c:2011) ==76584== by 0x1847B4: cmd_maintenance (gc.c:2030) ==76584== by 0x127A2E: run_builtin (git.c:453) ==76584== by 0x127E81: handle_builtin (git.c:704) ==76584== by 0x128142: run_argv (git.c:771) ==76584== ==76584== LEAK SUMMARY: ==76584== definitely lost: 305 bytes in 2 blocks ==76584== indirectly lost: 0 bytes in 0 blocks ==76584== possibly lost: 0 bytes in 0 blocks ==76584== still reachable: 34,575 bytes in 250 blocks ==76584== suppressed: 0 bytes in 0 blocks ==76584== Reachable blocks (those to which a pointer was found) are not shown. ==76584== To see them, rerun with: --leak-check=full --show-leak-kinds=all ==76584== ==76584== For lists of detected and suppressed errors, rerun with: -s ==76584== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0) Signed-off-by: Lénaïc Huard <lenaic@lhuard.fr> Acked-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-12 07:00:45 +09:00
Junio C Hamano	df6c4f722c	The sixteenth batch Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-11 15:27:23 +09:00
Junio C Hamano	2cd6ce21f3	Merge branch 'zh/trailer-cmd' The way the command line specified by the trailer.<token>.command configuration variable receives the end-user supplied value was both error prone and misleading. An alternative to achieve the same goal in a safer and more intuitive way has been added, as the trailer.<token>.cmd configuration variable, to replace it. * zh/trailer-cmd: trailer: add new .cmd config option docs: correct descript of trailer.<token>.command	2021-05-11 15:27:23 +09:00
Junio C Hamano	416449eaba	Merge branch 'jk/symlinked-dotgitx-cleanup' Various test and documentation updates about .gitsomething paths that are symlinks. * jk/symlinked-dotgitx-cleanup: docs: document symlink restrictions for dot-files fsck: warn about symlinked dotfiles we'll open with O_NOFOLLOW t0060: test ntfs/hfs-obscured dotfiles t7450: test .gitmodules symlink matching against obscured names t7450: test verify_path() handling of gitmodules t7415: rename to expand scope fsck_tree(): wrap some long lines fsck_tree(): fix shadowed variable t7415: remove out-dated comment about translation	2021-05-11 15:27:23 +09:00
Junio C Hamano	1af57f5d32	Merge branch 'jk/pack-objects-negative-options-fix' Options to "git pack-objects" that take numeric values like --window and --depth should not accept negative values; the input validation has been tightened. * jk/pack-objects-negative-options-fix: pack-objects: clamp negative depth to 0 t5316: check behavior of pack-objects --depth=0 pack-objects: clamp negative window size to 0 t5300: check that we produced expected number of deltas t5300: modernize basic tests	2021-05-11 15:27:23 +09:00
Junio C Hamano	270f8bfe00	Merge branch 'jk/doc-format-patch-skips-merges' Document that "format-patch" skips merges. * jk/doc-format-patch-skips-merges: docs/format-patch: mention handling of merges	2021-05-11 15:27:23 +09:00
Junio C Hamano	0b77301bf4	Merge branch 'jc/test-allows-local' Document that our test can use "local" keyword. * jc/test-allows-local: CodingGuidelines: explicitly allow "local" for test scripts	2021-05-11 15:27:22 +09:00
Junio C Hamano	74339f814c	Merge branch 'nc/submodule-update-quiet' "git submodule update --quiet" did not propagate the quiet option down to underlying "git fetch", which has been corrected. * nc/submodule-update-quiet: submodule update: silence underlying fetch with "--quiet"	2021-05-11 15:27:22 +09:00
Junio C Hamano	5feebddd86	Merge branch 'js/merge-already-up-to-date-message-reword' A few variants of informational message "Already up-to-date" has been rephrased. * js/merge-already-up-to-date-message-reword: merge: fix swapped "up to date" message components merge(s): apply consistent punctuation to "up to date" messages	2021-05-11 15:27:22 +09:00
Junio C Hamano	8ca4771dd0	Merge branch 'rj/bisect-skip-honor-terms' "git bisect skip" when custom words are used for new/old did not work, which has been corrected. * rj/bisect-skip-honor-terms: bisect--helper: use BISECT_TERMS in 'bisect skip' command	2021-05-11 15:27:22 +09:00
Will Chandler	5f03e5126d	refs: cleanup directories when deleting packed ref When deleting a packed ref via 'update-ref -d', a lockfile is made in the directory that would contain the loose copy of that ref, creating any directories in the ref's path that do not exist. When the transaction completes, the lockfile is deleted, but any empty parent directories made when creating the lockfile are left in place. These empty directories are not removed by 'pack-refs' or other housekeeping tasks and will accumulate over time. When deleting a loose ref, we remove all empty parent directories at the end of the transaction. This commit applies the parent directory cleanup logic used when deleting loose refs to packed refs as well. Signed-off-by: Will Chandler <wfc@wfchandler.org> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-11 13:59:57 +09:00
Alex Henrie	0e59f7ad67	merge-ort: split "distinct types" message into two translatable messages The word "renamed" has two possible translations in many European languages depending on whether one thing was renamed or two things were renamed. Give translators freedom to alter any part of the message to make it sound right in their language. Signed-off-by: Alex Henrie <alexhenrie24@gmail.com> Acked-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-11 12:26:01 +09:00
Junio C Hamano	49f38e2de4	The fifteenth batch Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-10 16:59:47 +09:00
Junio C Hamano	a0f521b56c	Merge branch 'rs/repack-without-loosening-promised-objects' "git repack -A -d" in a partial clone unnecessarily loosened objects in promisor pack. * rs/repack-without-loosening-promised-objects: repack: avoid loosening promisor objects in partial clones	2021-05-10 16:59:47 +09:00
Junio C Hamano	44ccb7629a	Merge branch 'ls/subtree' "git subtree" updates. * ls/subtree: (30 commits) subtree: be stricter about validating flags subtree: push: allow specifying a local rev other than HEAD subtree: allow 'split' flags to be passed to 'push' subtree: allow --squash to be used with --rejoin subtree: give the docs a once-over subtree: have $indent actually affect indentation subtree: don't let debug and progress output clash subtree: add comments and sanity checks subtree: remove duplicate check subtree: parse revs in individual cmd_ functions subtree: use "^{commit}" instead of "^0" subtree: don't fuss with PATH subtree: use "$*" instead of "$@" as appropriate subtree: use more explicit variable names for cmdline args subtree: use git-sh-setup's `say` subtree: use `git merge-base --is-ancestor` subtree: drop support for git < 1.7 subtree: more consistent error propagation subtree: don't have loose code outside of a function subtree: t7900: add porcelain tests for 'pull' and 'push' ...	2021-05-10 16:59:47 +09:00
Junio C Hamano	aaa3c8065d	Merge branch 'bc/hash-transition-interop-part-1' SHA-256 transition. * bc/hash-transition-interop-part-1: hex: print objects using the hash algorithm member hex: default to the_hash_algo on zero algorithm value builtin/pack-objects: avoid using struct object_id for pack hash commit-graph: don't store file hashes as struct object_id builtin/show-index: set the algorithm for object IDs hash: provide per-algorithm null OIDs hash: set, copy, and use algo field in struct object_id builtin/pack-redundant: avoid casting buffers to struct object_id Use the final_oid_fn to finalize hashing of object IDs hash: add a function to finalize object IDs http-push: set algorithm when reading object ID Always use oidread to read into struct object_id hash: add an algo member to struct object_id	2021-05-10 16:59:46 +09:00
Đoàn Trần Công Danh	59b519ab7e	am: learn to process quoted lines that ends with CRLF In previous changes, mailinfo has learnt to process lines that decoded from base64 or quoted-printable, and ends with CRLF. Let's teach "am" that new trick, too. Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-10 15:06:22 +09:00
Đoàn Trần Công Danh	133a4fda59	mailinfo: allow stripping quoted CR without warning In previous changes, we've turned on warning for quoted CR in base64 or quoted-printable email messages. Some projects see those quoted CR a lot, they know that it happens most of the time, and they find it's desirable to always strip those CR. Those projects in question usually fall back to use other tools to handle patches when receive such patches. Let's help those projects handle those patches by stripping those excessive CR. Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-10 15:06:22 +09:00
Đoàn Trần Công Danh	f1aa299443	mailinfo: allow squelching quoted CRLF warning In previous change, Git starts to warn for quoted CRLF in decoded base64/QP email. Despite those warnings are usually helpful, quoted CRLF could be part of some users' workflow. Let's give them an option to turn off the warning completely. Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-10 15:06:22 +09:00
Đoàn Trần Công Danh	0b689562ca	mailinfo: warn if CRLF found in decoded base64/QP email When SMTP servers receive 8-bit email messages, possibly with only LF as line ending, some of them decide to change said LF to CRLF. Some mailing list softwares, when receive 8-bit email messages, decide to encode those messages in base64 or quoted-printable. If an email is transfered through above mail servers, then distributed by such mailing list softwares, the recipients will receive an email contains a patch mungled with CRLF encoded inside another encoding. Thus, such CR (in CRLF) couldn't be dropped by "mailsplit". Hence, the mailed patch couldn't be applied cleanly. Such accidents have been observed in the wild [1]. Instead of silently rejecting those messages, let's give our users some warnings if such CR (as part of CRLF) is found. [1]: https://nmbug.notmuchmail.org/nmweb/show/m2lf9ejegj.fsf%40guru.guru-group.fi Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-10 15:06:22 +09:00
Martin Ågren	8c9ca6f095	pretty-formats.txt: add missing space The description of "%ch" is missing a space after "human style", before the parenthetical remark. This description was introduced in `b722d4560e` ("pretty: provide human date format", 2021-04-25). That commit also added "%ah", which does have the space already. Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-10 14:12:49 +09:00
Martin Ågren	fba8e4c3d0	git-repack.txt: remove spurious ")" Drop the ")" at the end of this paragraph. There's a parenthetical remark in this paragraph, but it's been closed on the line above. Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-10 14:12:47 +09:00
Junio C Hamano	2d677e5b15	The fourteenth batch Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-07 12:47:42 +09:00
Junio C Hamano	39c5392d68	Merge branch 'll/clone-reject-shallow' Fix tests when forced to use v0 protocol. * ll/clone-reject-shallow: t5601: mark protocol v2-only test	2021-05-07 12:47:42 +09:00
Junio C Hamano	70a890d42f	Merge branch 'si/zsh-complete-comment-fix' Portability fix for command line completion script (in contrib/). * si/zsh-complete-comment-fix: work around zsh comment in __git_complete_worktree_paths	2021-05-07 12:47:42 +09:00
Junio C Hamano	18e1ba1092	Merge branch 'dl/complete-stash-updates' Further update the command line completion (in contrib/) for "git stash". * dl/complete-stash-updates: git-completion.bash: consolidate cases in _git_stash() git-completion.bash: use $__git_cmd_idx in more places git-completion.bash: rename to $__git_cmd_idx git-completion.bash: separate some commands onto their own line	2021-05-07 12:47:41 +09:00
Junio C Hamano	848a17c274	Merge branch 'dl/complete-stash' The command line completion (in contrib/) for "git stash" has been updated. * dl/complete-stash: git-completion.bash: use __gitcomp_builtin() in _git_stash() git-completion.bash: extract from else in _git_stash() git-completion.bash: pass $__git_subcommand_idx from __git_main()	2021-05-07 12:47:41 +09:00
Junio C Hamano	936e58851a	Merge branch 'ah/plugleaks' Plug various leans reported by LSAN. * ah/plugleaks: builtin/rm: avoid leaking pathspec and seen builtin/rebase: release git_format_patch_opt too builtin/for-each-ref: free filter and UNLEAK sorting. mailinfo: also free strbuf lists when clearing mailinfo builtin/checkout: clear pending objects after diffing builtin/check-ignore: clear_pathspec before returning builtin/bugreport: don't leak prefixed filename branch: FREE_AND_NULL instead of NULL'ing real_ref bloom: clear each bloom_key after use ls-files: free max_prefix when done wt-status: fix multiple small leaks revision: free remainder of old commit list in limit_list	2021-05-07 12:47:41 +09:00
Junio C Hamano	8585d6c04a	Merge branch 'ps/rev-list-object-type-filter' "git rev-list" learns the "--filter=object:type=<type>" option, which can be used to exclude objects of the given kind from the packfile generated by pack-objects. * ps/rev-list-object-type-filter: rev-list: allow filtering of provided items pack-bitmap: implement combined filter pack-bitmap: implement object type filter list-objects: implement object type filter list-objects: support filtering by tag and commit list-objects: move tag processing into its own function revision: mark commit parents as NOT_USER_GIVEN uploadpack.txt: document implication of `uploadpackfilter.allow`	2021-05-07 12:47:41 +09:00
Junio C Hamano	826ef0e5e5	Merge branch 'ab/svn-tests-set-e-fix' Test clean-up. * ab/svn-tests-set-e-fix: svn tests: refactor away a "set -e" in test body svn tests: remove legacy re-setup from init-clone test	2021-05-07 12:47:40 +09:00
Junio C Hamano	0377ac98dc	Merge branch 'ab/rebase-no-reschedule-failed-exec' "git rebase --[no-]reschedule-failed-exec" did not work well with its configuration variable, which has been corrected. * ab/rebase-no-reschedule-failed-exec: rebase: don't override --no-reschedule-failed-exec with config rebase tests: camel-case rebase.rescheduleFailedExec consistently	2021-05-07 12:47:40 +09:00
Junio C Hamano	5a357fa477	Merge branch 'ab/doc-lint' Dev support. * ab/doc-lint: docs: fix linting issues due to incorrect relative section order doc lint: lint relative section order doc lint: lint and fix missing "GIT" end sections doc lint: fix bugs in, simplify and improve lint script doc lint: Perl "strict" and "warnings" in lint-gitlink.perl Documentation/Makefile: make doc.dep dependencies a variable again Documentation/Makefile: make $(wildcard howto/*.txt) a var	2021-05-07 12:47:40 +09:00
Junio C Hamano	fe069dce62	Merge branch 'mt/add-rm-in-sparse-checkout' "git add" and "git rm" learned not to touch those paths that are outside of sparse checkout. * mt/add-rm-in-sparse-checkout: rm: honor sparse checkout patterns add: warn when asked to update SKIP_WORKTREE entries refresh_index(): add flag to ignore SKIP_WORKTREE entries pathspec: allow to ignore SKIP_WORKTREE entries on index matching add: make --chmod and --renormalize honor sparse checkouts t3705: add tests for `git add` in sparse checkouts add: include magic part of pathspec on --refresh error	2021-05-07 12:47:40 +09:00
Junio C Hamano	e706aaf3bc	Merge branch 'ps/config-global-override' Replace GIT_CONFIG_NOSYSTEM mechanism to decline from reading the system-wide configuration file with GIT_CONFIG_SYSTEM that lets users specify from which file to read the system-wide configuration (setting it to an empty file would essentially be the same as setting NOSYSTEM), and introduce GIT_CONFIG_GLOBAL to override the per-user configuration in $HOME/.gitconfig. * ps/config-global-override: t1300: fix unset of GIT_CONFIG_NOSYSTEM leaking into subsequent tests config: allow overriding of global and system configuration config: unify code paths to get global config paths config: rename `git_etc_config()`	2021-05-07 12:47:39 +09:00
Junio C Hamano	f16a4660de	Merge branch 'zh/pretty-date-human' "git log --format=..." placeholders learned %ah/%ch placeholders to request the --date=human output. * zh/pretty-date-human: pretty: provide human date format	2021-05-07 12:47:39 +09:00
Junio C Hamano	c108c8c2f2	Merge branch 'zh/format-ref-array-optim' "git (branch\|tag) --format=..." has been micro-optimized. * zh/format-ref-array-optim: ref-filter: reuse output buffer ref-filter: get rid of show_ref_array_item	2021-05-07 12:47:39 +09:00
Junio C Hamano	bb2feec17f	Merge branch 'ad/cygwin-no-backslashes-in-paths' Cygwin pathname handling fix. * ad/cygwin-no-backslashes-in-paths: cygwin: disallow backslashes in file names	2021-05-07 12:47:39 +09:00
Junio C Hamano	6d99f31dda	Merge branch 'jz/apply-3way-first-message-fix' When we swapped the order of --3way fallback, we forgot to adjust the message we give when the first method fails and the second method is attempted (which used to be "direct application failed hence we try 3way", now it is the other way around). * jz/apply-3way-first-message-fix: apply: adjust messages to account for --3way changes	2021-05-07 12:47:38 +09:00
Junio C Hamano	6e08cbdf38	Merge branch 'jk/prune-with-bitmap-fix' When the reachability bitmap is in effect, the "do not lose recently created objects and those that are reachable from them" safety to protect us from races were disabled by mistake, which has been corrected. * jk/prune-with-bitmap-fix: prune: save reachable-from-recent objects with bitmaps pack-bitmap: clean up include_check after use	2021-05-07 12:47:38 +09:00
Junio C Hamano	e60e9cc20e	Merge branch 'po/diff-patch-doc' Doc update. * po/diff-patch-doc: doc: point to diff attribute in patch format docs	2021-05-07 12:47:38 +09:00
Junio C Hamano	a850356d1b	Merge branch 'hn/trace-reflog-expiry' The reflog expiry machinery has been taught to emit trace events. * hn/trace-reflog-expiry: refs/debug: trace into reflog expiry too	2021-05-07 12:47:38 +09:00
Junio C Hamano	e5d99d378b	Merge branch 'ab/pretty-date-format-tests' Tweak a few tests for "log --format=..." that show timestamps in various formats. * ab/pretty-date-format-tests: pretty tests: give --date/format tests a better description pretty tests: simplify %aI/%cI date format test	2021-05-07 12:47:38 +09:00
Junio C Hamano	5f586f55a0	Merge branch 'ps/config-env-option-with-separate-value' "git --config-env var=val cmd" weren't accepted (only --config-env=var=val was). * ps/config-env-option-with-separate-value: git: support separate arg for `--config-env`'s value git.txt: fix synopsis of `--config-env` missing the equals sign	2021-05-07 12:47:37 +09:00
Matheus Tavares	3a7f0908b6	clean: remove unnecessary variable The variable `matches` used to hold the return of a `dir_path_match()` call that was removed in `95c11ecc73` ("Fix error-prone fill_directory() API; make it only return matches", 2020-04-01). Now `matches` will always hold 0, which is the value it's initialized with; and the condition `matches != MATCHED_EXACTLY` will always evaluate to true. So let's remove this unnecessary variable. Interestingly, it seems that `matches != MATCHED_EXACTLY` was already unnecessary before `95c11ecc73`. That's because `remove_directories` is always set to 1 when we have pathspecs; So, in the condition `!remove_directories && matches != MATCHED_EXACTLY`, we would either: - have pathspecs (or have been given `-d`) and ignore `matches` because `remove_directories` is 1; or - not have pathspecs (nor `-d`) and end up just checking that `0 != MATCHED_EXACTLY`, as `matches` would never get reassigned after its zero initialization (because there is no pathspec to match). Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-07 07:48:11 +09:00
Đoàn Trần Công Danh	dd9323b7fb	mailinfo: stop parsing options manually In a later change, mailinfo will learn more options, let's switch to our robust parse_options framework before that step. Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-07 06:40:26 +09:00
Đoàn Trần Công Danh	d582992e80	mailinfo: load default metainfo_charset lazily In a later change, we will use parse_option to parse mailinfo's options. In mailinfo, both "-u", "-n", and "--encoding" try to set the same field, with "-u" reset that field to some default value from configuration variable "i18n.commitEncoding". Let's delay the setting of that field until we finish processing all options. By doing that, "i18n.commitEncoding" can be parsed on demand. More importantly, it cleans the way for using parse_option. This change introduces some inconsistent brackets "{}" in "if/else if" construct, however, we will rewrite them in the next few changes. Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-07 06:40:25 +09:00
Øystein Walle	a1989cf7b8	add: die if both --dry-run and --interactive are given The interactive machinery does not obey --dry-run. Die appropriately if both flags are passed. Signed-off-by: Øystein Walle <oystwa@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-07 06:14:04 +09:00
Ævar Arnfjörð Bjarmason	d4e2d15a8b	streaming.c: move {open,close,read} from vtable to "struct git_istream" Move the definition of the structure around the open/close/read functions introduced in `46bf043807` (streaming: a new API to read from the object store, 2011-05-11) to instead populate "close" and "read" members in the "struct git_istream". This gets us rid of an extra pointer deference, and I think makes more sense. The "close" and "read" functions are the primary interface to the stream itself. Let's also populate a "open" callback in the same struct. That's now used by open_istream() after istream_source() decides what "open" function should be used. This isn't needed to get rid of the "stream_vtbl" variables, but makes sense for consistency. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-06 12:56:10 +09:00
Ævar Arnfjörð Bjarmason	de94c0eace	streaming.c: stop passing around "object_info *" to open() Change the streaming interface to stop passing around the "struct object_info" the open() functions. As seen in `7ef2d9a260` (streaming: read non-delta incrementally from a pack, 2011-05-13) which introduced the "st->u.in_pack" assignments being changed here only the open_istream_pack_non_delta() path need these. So let's instead do this when preparing the selected callback in the istream_source() function. This might also allow the compiler to reduce the lifetime of the "oi" variable, as we've moved it from "git_istream()" to "istream_source()". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-06 12:56:09 +09:00
Ævar Arnfjörð Bjarmason	bc062ad001	streaming.c: remove {open,close,read}_method_decl() macros Remove the {open,close,read}_method_decl() macros added in `46bf043807` (streaming: a new API to read from the object store, 2011-05-11) in favor of inlining the definition of the arguments of these functions. Since we'll end up using them via the "{open,close,read}_istream_fn" types we don't gain anything in the way of compiler checking by using these macros, and as of preceding commits we no longer need to declare these argument lists twice. So declaring them at a distance just serves to make the code less readable. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-06 12:56:06 +09:00
Ævar Arnfjörð Bjarmason	0d9af06e36	streaming.c: remove enum/function/vtbl indirection Remove the indirection of discovering a function pointer to use via an enum and virtual table. This refactors code added in `46bf043807` (streaming: a new API to read from the object store, 2011-05-11). We can instead simply return an "open_istream_fn" for use from the "istream_source()" selector function directly. This allows us to get rid of the "incore", "loose" and "pack_non_delta" enum variables. We'll return the functions instead. The "stream_error" variable in that enum can likewise go in favor of returning NULL, which is what the open_istream() was doing when it got that value anyway. We can thus remove the entire enum, and the "open_istream_tbl" virtual table that (indirectly) referenced it. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-06 12:56:04 +09:00
Ævar Arnfjörð Bjarmason	b65528360f	streaming.c: avoid forward declarations Change code added in `46bf043807` (streaming: a new API to read from the object store, 2011-05-11) to avoid forward declarations of the functions it uses. We can instead move this code to the bottom of the file, and thus avoid the open_method_decl() calls. Aside from the addition of the "static helpers[...]" comment being added here, and the removal of the forward declarations this is a move-only change. The style of the added "static helpers[...]" comment isn't in line with our usual coding style, but is consistent with several other comments used in this file, so let's use that style consistently here. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-06 12:56:02 +09:00
Ævar Arnfjörð Bjarmason	b79f9c075d	sparse-index.c: remove set_index_sparse_config() Remove the set_index_sparse_config() function by folding it into set_sparse_index_config(), which was its only user. Since `122ba1f7b5` (sparse-checkout: toggle sparse index from builtin, 2021-03-30) the flow of this code hasn't made much sense, we'd get "enabled" in set_sparse_index_config(), proceed to call set_index_sparse_config() with it. There we'd call prepare_repo_settings() and set "repo->settings.sparse_index = 1", only to needlessly call prepare_repo_settings() again in set_sparse_index_config() (where it would early abort), and finally setting "repo->settings.sparse_index = enabled". Instead we can just call prepare_repo_settings() once, and set the variable to "enabled" in the first place. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-06 12:53:46 +09:00
Joachim Kuebart	6b79818bfb	git-p4: speed up search for branch parent For every new branch that git-p4 imports, it needs to find the commit where it branched off its parent branch. While p4 doesn't record this information explicitly, the first changelist on a branch is usually an identical copy of the parent branch. The method searchParent() tries to find a commit in the history of the given "parent" branch whose tree exactly matches the initial changelist of the new branch, "target". The code iterates through the parent commits and compares each of them to this initial changelist using diff-tree. Since we already know the tree object name we are looking for, spawning diff-tree for each commit is wasteful. Use the "--format" option of "rev-list" to find out the tree object name of each commit in the history, and find the tree whose name is exactly the same as the tree of the target commit to optimize this. This results in a considerable speed-up, at least on Windows. On one Windows machine with a fairly large repository of about 16000 commits in the parent branch, the current code takes over 7 minutes, while the new code only takes just over 10 seconds for the same changelist: Before: $ time git p4 sync Importing from/into multiple branches Depot paths: //depot Importing revision 31274 (100.0%) Updated branches: b1 real 7m41.458s user 0m0.000s sys 0m0.077s After: $ time git p4 sync Importing from/into multiple branches Depot paths: //depot Importing revision 31274 (100.0%) Updated branches: b1 real 0m10.235s user 0m0.000s sys 0m0.062s Signed-off-by: Joachim Kuebart <joachim.kuebart@gmail.com> Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Luke Diamand <luke@diamand.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-06 12:51:33 +09:00
Joachim Kuebart	c3ab08844c	git-p4: ensure complex branches are cloned correctly When importing a branch from p4, git-p4 searches the history of the parent branch for the branch point. The test for the complex branch structure ensures all files have the expected contents, but doesn't examine the branch structure. Check for the correct branch structure by making sure that the initial commit on each branch is empty. This ensures that the initial commit's parent is indeed the correct branch-off point. Signed-off-by: Joachim Kuebart <joachim.kuebart@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-06 12:51:31 +09:00
Phillip Wood	f91371b948	patience diff: remove unused variable Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-05 18:56:48 +09:00
Phillip Wood	204aa2d24d	patience diff: remove unnecessary string comparisons xdl_prepare_env() calls xdl_classify_record() which arranges for the hashes of non-matching lines to be different so lines can be tested for equality by comparing just their hashes. This reduces the time taken to calculate the diff of v2.28.0 to v2.29.0 by ~3-4%. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-05 18:56:48 +09:00
Phillip Wood	0324e8fc6b	word diff: handle zero length matches If find_word_boundaries() encounters a zero length match (which can be caused by matching a newline or using '' instead of '+' in the regex) we stop splitting the input into words which generates an inaccurate diff. To fix this increment the start point when there is a zero length match and try a new match. This is safe as posix regular expressions always return the longest available match so a zero length match means there are no longer matches available from the current position. Commit `bf82940dbf` (color-words: enable REG_NEWLINE to help user, 2009-01-17) prevented matching newlines in negated character classes but it is still possible for the user to have an explicit newline match in the regex which could cause a zero length match. One could argue that having explicit newline matches or using '' rather than '+' are user errors but it seems to be better to work round them than produce inaccurate diffs. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-05 18:53:42 +09:00
Matheus Tavares	87094fc2da	ci: run test round with parallel-checkout enabled We already have tests for the basic parallel-checkout operations. But this code can also run be executed by other commands, such as git-read-tree and git-sparse-checkout, which are currently not tested with multiple workers. To promote a wider test coverage without duplicating tests: 1. Add the GIT_TEST_CHECKOUT_WORKERS environment variable, to optionally force parallel-checkout execution during the whole test suite. 2. Set this variable (with a value of 2) in the second test round of our linux-gcc CI job. This round runs `make test` again with some optional GIT_TEST_* variables enabled, so there is no additional overhead in exercising the parallel-checkout code here. Note that tests checking out less than two parallel-eligible entries will fall back to the sequential mode. Nevertheless, it's still a good exercise for the parallel-checkout framework as the fallback codepath also writes the queued entries using the parallel-checkout functions (only without spawning any worker). Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-05 12:27:17 +09:00
Matheus Tavares	d5904220bc	parallel-checkout: add tests related to .gitattributes Add tests to confirm that the `struct conv_attrs` data is correctly passed from the main process to the workers, and that they can properly convert the blobs before writing them to the working tree. Also check that parallel-ineligible entries, such as regular files that require external filters, are correctly smudge and written when parallel-checkout is enabled. Co-authored-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-05 12:26:36 +09:00
Matheus Tavares	2fa3cbadcd	t0028: extract encoding helpers to lib-encoding.sh The following patch will add tests outside t0028 which will also need to re-encode some strings. Extract the auxiliary encoding functions from t0028 to a common lib file so that they can be reused. Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-05 12:26:36 +09:00
Matheus Tavares	6a7bc9d118	parallel-checkout: add tests related to path collisions Add tests to confirm that path collisions are properly detected by checkout workers, both to avoid race conditions and to report colliding entries on clone. Co-authored-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-05 12:26:36 +09:00
Matheus Tavares	d0e5d35700	parallel-checkout: add tests for basic operations Add tests to populate the working tree during clone and checkout using sequential and parallel mode, to confirm that they produce identical results. Also test basic checkout mechanics, such as checking for symlinks in the leading directories and the abidance to --force. Note: some helper functions are added to a common lib file which is only included by t2080 for now. But they will also be used by other parallel-checkout tests in the following patches. Co-authored-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-05 12:26:36 +09:00
Matheus Tavares	70b052b209	checkout-index: add parallel checkout support Allow checkout-index to use the parallel checkout framework, honoring the checkout.workers configuration. There are two code paths in checkout-index which call `checkout_entry()`, and thus, can make use of parallel checkout: `checkout_file()`, which is used to write paths explicitly given at the command line; and `checkout_all()`, which is used to write all paths in the index, when the `--all` option is given. In both operation modes, checkout-index doesn't abort immediately on a `checkout_entry()` failure. Instead, it tries to check out all remaining paths before exiting with a non-zero exit code. To keep this behavior when parallel checkout is being used, we must allow `run_parallel_checkout()` to try writing the queued entries before we exit, even if we already got an error code from a previous `checkout_entry()` call. However, `checkout_all()` doesn't return on errors, it calls `exit()` with code 128. We could make it call `run_parallel_checkout()` before exiting, but it makes the code easier to follow if we unify the exit path for both checkout-index modes at `cmd_checkout_index()`, and let this function take care of the interactions with the parallel checkout API. So let's do that. With this change, we also have to consider whether we want to keep using 128 as the error code for `git checkout-index --all`, while we use 1 for `git checkout-index <path>` (even when the actual error is the same). Since there is not much value in having code 128 only for `--all`, and there is no mention about it in the docs (so it's unlikely that changing it will break any existing script), let's make both modes exit with code 1 on `checkout_entry()` errors. Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-05 12:26:36 +09:00
Matheus Tavares	6053950632	builtin/checkout.c: complete parallel checkout support Pathspec-limited checkouts (like `git checkout *.txt`) are performed by a code path that doesn't yet support parallel checkout because it calls checkout_entry() directly, instead of unpack_trees(). Let's add parallel checkout support for this code path too. The transient cache entries allocated in checkout_merged() are now allocated in a mem_pool which is only discarded after parallel checkout finishes. This is done because the entries need to be valid when run_parallel_checkout() is called. Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-05 12:26:33 +09:00
Matheus Tavares	9616882780	make_transient_cache_entry(): optionally alloc from mem_pool Allow make_transient_cache_entry() to optionally receive a mem_pool struct in which it should allocate the entry. This will be used in the following patch, to store some transient entries which should persist until parallel checkout finishes. Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-05 12:25:25 +09:00
Jonathan Tan	b89c731228	t5601: mark protocol v2-only test A HTTP-clone test introduced in `4fe788b1b0` ("builtin/clone.c: add --reject-shallow option", 2021-04-01) only works in protocol v2, but is not marked as such. The aforementioned patch implements --reject-shallow for a variety of situations, but usage of a protocol that requires a remote helper is not one of them. (Such an implementation would require extending the remote helper protocol to support the passing of a "reject shallow" option, and then teaching it to both protocol-speaking ends.) For now, to make it pass when GIT_TEST_PROTOCOL_VERSION=0 is passed, add "-c protocol.version=2". A more complete solution would be either to augment the remote helper protocol to support this feature or to return a fatal error when using --reject-shallow with a protocol that uses a remote helper. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-05 10:54:41 +09:00
Jonathan Tan	477673d6f3	send-pack: support push negotiation Teach Git the push.negotiate config variable. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-05 10:41:29 +09:00
Jonathan Tan	9c1e657a8f	fetch: teach independent negotiation (no packfile) Currently, the packfile negotiation step within a Git fetch cannot be done independent of sending the packfile, even though there is at least one application wherein this is useful. Therefore, make it possible for this negotiation step to be done independently. A subsequent commit will use this for one such application - push negotiation. This feature is for protocol v2 only. (An implementation for protocol v0 would require a separate implementation in the fetch, transport, and transport helper code.) In the protocol, the main hindrance towards independent negotiation is that the server can unilaterally decide to send the packfile. This is solved by a "wait-for-done" argument: the server will then wait for the client to say "done". In practice, the client will never say it; instead it will cease requests once it is satisfied. In the client, the main change lies in the transport and transport helper code. fetch_refs_via_pack() performs everything needed - protocol version and capability checks, and the negotiation itself. There are 2 code paths that do not go through fetch_refs_via_pack() that needed to be individually excluded: the bundle transport (excluded through requiring smart_options, which the bundle transport doesn't support) and transport helpers that do not support takeover. If or when we support independent negotiation for protocol v0, we will need to modify these 2 code paths to support it. But for now, report failure if independent negotiation is requested in these cases. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-05 10:41:29 +09:00
Sardorbek Imomaliev	f2acf763e2	work around zsh comment in __git_complete_worktree_paths [PATCH]: contrib/completion/git-completion.bash, there is a construct where comment lines are placed between the command that is on the upstream of a pipe and the command that is on the downstream of a pipe in __git_complete_worktree_paths function. Unfortunately, this script is also used by Zsh completion, but Zsh mishandles this construct when "interactive_comments" option is not set (by default it is off on macOS), resulting in a breakage: $ git worktree remove [TAB] $ git worktree remove __git_complete_worktree_paths:7: command not found: # Move the comment, even though it explains what happens on the downstream of the pipe and logically belongs where it is right now, before the entire pipeline, to work around this problem. Signed-off-by: Sardorbek Imomaliev <sardorbek.imomaliev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-04 12:17:23 +09:00
ZheNing Hu	c364b7ef51	trailer: add new .cmd config option The `trailer.<token>.command` configuration variable specifies a command (run via the shell, so it does not have to be a single name or path to the command, but can be a shell script), and the first occurrence of substring $ARG is replaced with the value given to the `interpret-trailer` command for the token in a '--trailer <token>=<value>' argument. This has three downsides: * The use of $ARG in the mechanism misleads the users that the value is passed in the shell variable, and tempt them to use $ARG more than once, but that would not work, as the second and subsequent $ARG are not replaced. * Because $ARG is textually replaced without regard to the shell language syntax, even '$ARG' (inside a single-quote pair), which a user would expect to stay intact, would be replaced, and worse, if the value had an unmatched single quote (imagine a name like "O'Connor", substituted into NAME='$ARG' to make it NAME='O'Connor'), it would result in a broken command that is not syntactically correct (or worse). * The first occurrence of substring `$ARG` will be replaced with the empty string, in the command when the command is first called to add a trailer with the specified <token>. This is a bad design, the nature of automatic execution causes it to add a trailer that we don't expect. Introduce a new `trailer.<token>.cmd` configuration that takes higher precedence to deprecate and eventually remove `trailer.<token>.command`, which passes the value as an argument to the command. Instead of "$ARG", users can refer to the value as positional argument, $1, in their scripts. At the same time, in order to allow `git interpret-trailers` to better simulate the behavior of `git command -s`, 'trailer.<token>.cmd' will not automatically execute. Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Christian Couder <christian.couder@gmail.com> Signed-off-by: ZheNing Hu <adlternative@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-04 12:09:43 +09:00
ZheNing Hu	57dcb6575b	docs: correct descript of trailer.<token>.command In the original documentation of `trailer.<token>.command`, some descriptions are easily misunderstood. So let's modify it to increase its readability. In addition, clarify that `$ARG` in command can only be replaced once. Signed-off-by: ZheNing Hu <adlternative@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-04 12:09:43 +09:00
Jeff King	8ff06de10c	docs: document symlink restrictions for dot-files We stopped allowing symlinks for .gitmodules files in `10ecfa7649` (verify_path: disallow symlinks in .gitmodules, 2018-05-04), and we stopped following symlinks for .gitattributes, .gitignore, and .mailmap in the commits from `204333b015` (Merge branch 'jk/open-dotgitx-with-nofollow', 2021-03-22). The reasons are discussed in detail there, but we never adjusted the documentation to let users know. This hasn't been a big deal since the point is that such setups were mildly broken and thought to be unusual anyway. But it certainly doesn't hurt to be clear and explicit about it. Suggested-by: Philip Oakley <philipoakley@iee.email> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-04 11:52:03 +09:00
Jeff King	bb6832d552	fsck: warn about symlinked dotfiles we'll open with O_NOFOLLOW In the commits merged in via `204333b015` (Merge branch 'jk/open-dotgitx-with-nofollow', 2021-03-22), we stopped following symbolic links for .gitattributes, .gitignore, and .mailmap files. Let's teach fsck to warn that these symlinks are not going to do anything. Note that this is just a warning, and won't block the objects via transfer.fsckObjects, since there are reported to be cases of this in the wild (and even once fixed, they will continue to exist in the commit history of those projects, but are not particularly dangerous). Note that we won't add these to the existing gitmodules block in the fsck code. The logic for gitmodules is a bit more complicated, as we also check the content of non-symlink instances we find. But for these new files, there is no content check; we're just looking at the name and mode of the tree entry (and we can avoid even the complicated name checks in the common case that the mode doesn't indicate a symlink). We can reuse the test helper function we defined for .gitmodules, though (it needs some slight adjustments for the fsck error code, and because we don't block these symlinks via verify_path()). Note that I didn't explicitly test the transfer.fsckObjects case here (nor does the existing .gitmodules test that it blocks a push). The translation of fsck severities to outcomes is covered in general in t5504. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-04 11:52:02 +09:00
Jeff King	801ed010bf	t0060: test ntfs/hfs-obscured dotfiles We have tests that cover various filesystem-specific spellings of ".gitmodules", because we need to reliably identify that path for some security checks. These are from `dc2d9ba318` (is_{hfs,ntfs}_dotgitmodules: add tests, 2018-05-12), with the actual code coming from `e7cb0b4455` (is_ntfs_dotgit: match other .git files, 2018-05-11) and `0fc333ba20` (is_hfs_dotgit: match other .git files, 2018-05-02). Those latter two commits also added similar matching functions for .gitattributes and .gitignore. These ended up not being used in the final series, and are currently dead code. But in preparation for them being used in some fsck checks, let's make sure they actually work by throwing a few basic tests at them. Likewise, let's cover .mailmap (which does need matching code added). I didn't bother with the whole battery of tests that we cover for .gitmodules. These functions are all based on the same generic matcher, so it's sufficient to test most of the corner cases just once. Note that the ntfs magic prefix names in the tests come from the algorithm described in `e7cb0b4455` (and are different for each file). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-04 11:52:02 +09:00
Jeff King	1cb12f3339	t7450: test .gitmodules symlink matching against obscured names In t7450 we check that both verify_path() and fsck catch malformed .gitmodules entries in trees. However, we don't check that we catch filesystem-equivalent forms of these (e.g., ".GITMOD~1" on Windows). Our name-matching functions are exercised well in t0060, but there's nothing to test that we correctly call the matching functions from the actual fsck and verify_path() code. So instead of testing just .gitmodules, let's repeat our tests for a few basic cases. We don't need to be exhaustive here (t0060 handles that), but just make sure we hit one name of each type. Besides pushing the tests into a function that takes the path as a parameter, we'll need to do a few things: - adjust the directory name to accommodate the tests running multiple times - set core.protecthfs for index checks. Fsck always protects all types by default, but we want to be able to exercise the HFS routines on every system. Note that core.protectntfs is already the default these days, but it doesn't hurt to explicitly label our need for it. - we'll also take the filename ("gitmodules") as a parameter. All calls use the same name for now, but a future patch will extend this to handle other .gitfoo files. Note that our fake-content symlink destination is somewhat .gitmodules specific. But it isn't necessary for other files (which don't do a content check). And it happens to be a valid attribute and ignore file anyway. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-04 11:52:02 +09:00
Jeff King	a1ca398ba7	t7450: test verify_path() handling of gitmodules Commit `10ecfa7649` (verify_path: disallow symlinks in .gitmodules, 2018-05-04) made it impossible to load a symlink .gitmodules file into the index. However, there are no tests of this behavior. Let's make sure this case is covered. We can easily reuse the test setup created by the matching `b7b1fca175` (fsck: complain when .gitmodules is a symlink, 2018-05-04). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-04 11:52:02 +09:00
Jeff King	43a2220f19	t7415: rename to expand scope This script has already expanded beyond its original intent of ".. in submodule names" to include other malicious submodule bits. Let's update the name and description to reflect that, as well as the fact that we'll soon be adding similar tests for other dotfiles (.gitattributes, etc). We'll also renumber it to move it out of the group of submodule-specific tests. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-03 14:41:08 +09:00
Jeff King	0282f6799f	fsck_tree(): wrap some long lines Many calls to report() in fsck_tree() are kept on a single line and are quite long. Most were pretty big to begin with, but have gotten even longer over the years as we've added more parameters. Let's accept the churn of wrapping them in order to conform to our usual line limits. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-03 14:41:08 +09:00
Jeff King	9e1947cb48	fsck_tree(): fix shadowed variable Commit `b2f2039c2b` (fsck: accept an oid instead of a "struct tree" for fsck_tree(), 2019-10-18) introduced a new "oid" parameter to fsck_tree(), and we pass it to the report() function when we find problems. However, that is shadowed within the tree-walking loop by the existing "oid" variable which we use to store the oid of each tree entry. As a result, we may report the wrong oid for some problems we detect within the loop (the entry oid, instead of the tree oid). Our tests didn't catch this because they checked only that we found the expected fsck problem, not that it was attached to the correct object. Let's rename both variables in the function to avoid confusion. This makes the diff a little noisy (e.g., all of the report() calls outside the loop were already correct but need to be touched), but makes sure we catch all cases and will avoid similar confusion in the future. And we can update the test to be a bit more specific and catch this problem. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-03 14:41:08 +09:00
Jeff King	963d02a24a	t7415: remove out-dated comment about translation Since GETTEXT_POISON does not exist anymore, there is no point warning people about whether we should use test_i18ngrep. This is doubly confusing because the comment was describing why it was OK to use grep, but it got caught up in the mass conversion of `674ba34038` (fsck: mark strings for translation, 2018-11-10). Note there are other uses of test_i18ngrep in this script which are now obsolete; I'll save those for a mass-cleanup. My goal here was just to fix the confusing comment in code I'm about to refactor. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-03 14:41:08 +09:00
Jeff King	8e0601f568	docs/format-patch: mention handling of merges Format-patch doesn't have a way to format merges in a way that can be applied by git-am (or any other tool), and so it just omits them. However, this may be a surprising implication for users who are not well versed in how the tool works. Let's add a note to the documentation making this more clear. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-03 14:32:39 +09:00
Jeff King	6d52b6a5df	pack-objects: clamp negative depth to 0 A negative delta depth makes no sense, and the code is not prepared to handle it. If passed "--depth=-1" on the command line, then this line from break_delta_chains(): cur->depth = (total_depth--) % (depth + 1); triggers a divide-by-zero. This is undefined behavior according to the C standard, but on POSIX systems results in SIGFPE killing the process. This is certainly one way to inform the use that the command was invalid, but it's a bit friendlier to just treat it as "don't allow any deltas", which we already do for --depth=0. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-03 14:30:46 +09:00
Jeff King	49ac1d33bb	t5316: check behavior of pack-objects --depth=0 We'd expect this to cleanly produce no deltas at all (as opposed to getting confused by an out-of-bounds value), and it does. Note we have to adjust our max_chain test helper, which expected to find at least one delta. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-03 14:29:56 +09:00
Jeff King	953aa54e1a	pack-objects: clamp negative window size to 0 A negative window size makes no sense, and the code in find_deltas() is not prepared to handle it. If you pass "-1", for example, we end up generate a 0-length array of "struct unpacked", but our loop assumes it has at least one entry in it (and we end up reading garbage memory). We could complain to the user about this, but it's more forgiving to just clamp it to 0, which means "do not find any deltas at all". The 0-case is already tested earlier in the script, so we'll make sure this does the same thing. Reported-by: Yiyuan guo <yguoaz@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-03 14:29:27 +09:00
Jeff King	95356789ee	t5300: check that we produced expected number of deltas We pack a set of objects both with and without --window=0, assuming that the 0-length window will cause us not to produce any deltas. Let's confirm that this is the case. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-03 14:29:16 +09:00
Jeff King	5489899812	t5300: modernize basic tests The first set of tests in t5300 goes back to 2005, and doesn't use some of our customary style and tools these days. In preparation for touching them, let's modernize a few things: - titles go on the line with test_expect_success, with a hanging open-quote to start the test body - test bodies should be indented with tabs - opening braces for shell blocks in &&-chains go on their own line - no space between redirect operators and files (">foo", not "> foo") - avoid doing work outside of test blocks; in this case, we can stick the setup of ".git2" into the appropriate blocks - avoid modifying and then cleaning up the environment or current directory by using subshells and "git -C" - this test does a curious thing when testing the unpacking: it sets GIT_OBJECT_DIRECTORY, and then does a "git init" in the _original_ directory, creating a weird mixed situation. Instead, it's much simpler to just "git init --bare" a new repository to unpack into, and check the results there. I renamed this "git2" instead of ".git2" to make it more clear it's a separate repo. - we can observe that the bodies of the no-delta, ref_delta, and ofs_delta cases are all virtually identical except for the pack creation, and factor out shared helper functions. I collapsed "do the unpack" and "check the results of the unpack" into a single test, since that makes the expected lifetime of the "git2" temporary directory more clear (that also lets us use test_when_finished to clean it up). This does make the "-v" output slightly less useful, but the improvement in reading the actual test code makes it worth it. - I dropped the "pwd" calls from some tests. These don't do anything functional, and I suspect may have been an aid for debugging when the script was more cavalier about leaving the working directory changed between tests. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-03 14:29:16 +09:00
Junio C Hamano	a84fd3bcc6	CodingGuidelines: explicitly allow "local" for test scripts `01d3a526` (t0000: check whether the shell supports the "local" keyword, 2017-10-26) raised a test balloon to see if those who build and test Git use a platform with a shell that lacks support for the "local" keyword. After two years, `7f0b5908` (t0000: reword comments for "local" test, 2019-08-08) documented that "local" keyword, even though is outside POSIX, is allowed in our test scripts. Let's write it in the CodingGuidelines, too. It might be tempting to allow it in scripted Porcelains (we have avoided getting them contaminiated by "local" so far), but they are on their way out and getting rewritten in C. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-03 14:24:11 +09:00
Josh Soref	ad9322da03	merge: fix swapped "up to date" message components The rewrite of git-merge from shell to C in `1c7b76be7d` (Build in merge, 2008-07-07) accidentally transformed the message: Already up-to-date. (nothing to squash) to: (nothing to squash)Already up-to-date. due to reversed printf() arguments. This problem has gone unnoticed despite being touched over the years by `7f87aff22c` (Teach/Fix pull/fetch -q/-v options, 2008-11-15) and `bacec47845` (i18n: git-merge basic messages, 2011-02-22), and tangentially by `bef4830e88` (i18n: merge: mark messages for translation, 2016-06-17) and `7560f547e6` (treewide: correct several "up-to-date" to "up to date", 2017-08-23). Fix it by restoring the message to its intended order. While at it, help translators out by avoiding "sentence Lego". [es: rewrote commit message] Co-authored-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Josh Soref <jsoref@gmail.com> Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-03 14:14:58 +09:00
Eric Sunshine	80cde95eec	merge(s): apply consistent punctuation to "up to date" messages Although the various "Already up to date" messages resulting from merge attempts share identical phrasing, they use a mix of punctuation ranging from "." to "!" and even "Yeeah!", which leads to extra work for translators. Ease the job of translators by settling upon "." as punctuation for all such messages. While at it, take advantage of printf_ln() to further ease the translation task so translators need not worry about line termination, and fix a case of missing line termination in the (unused) merge_ort_nonrecursive() function. Suggested-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-03 14:14:56 +09:00
Nicholas Clark	62af4bdd42	submodule update: silence underlying fetch with "--quiet" Commands such as $ git submodule update --quiet --init --depth=1 involving shallow clones, call the shell function fetch_in_submodule, which in turn invokes git fetch. Pass the --quiet option onward there. Signed-off-by: Nicholas Clark <nick@ccl4.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-03 12:24:38 +09:00
Junio C Hamano	7e39198978	The thirteenth batch Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-30 13:50:27 +09:00
Junio C Hamano	93e0b28dbb	Merge branch 'ab/pathname-encoding-doc' Clarify that pathnames recorded in Git trees are most often (but not necessarily) encoded in UTF-8. * ab/pathname-encoding-doc: doc: clarify the filename encoding in git diff	2021-04-30 13:50:27 +09:00
Junio C Hamano	5980e0d442	Merge branch 'vs/completion-with-set-u' Effort to make the command line completion (in contrib/) safe with "set -u" continues. * vs/completion-with-set-u: completion: avoid aliased command lookup error in nounset mode	2021-04-30 13:50:27 +09:00
Junio C Hamano	bf0d4c8491	Merge branch 'hn/refs-trace-errno' Show errno in the trace output in the error codepath that calls read_raw_ref method. * hn/refs-trace-errno: refs: print errno for read_raw_ref if GIT_TRACE_REFS is set	2021-04-30 13:50:27 +09:00
Junio C Hamano	a1cac26cc6	Merge branch 'mt/parallel-checkout-part-2' The checkout machinery has been taught to perform the actual write-out of the files in parallel when able. * mt/parallel-checkout-part-2: parallel-checkout: add design documentation parallel-checkout: support progress displaying parallel-checkout: add configuration options parallel-checkout: make it truly parallel unpack-trees: add basic support for parallel checkout	2021-04-30 13:50:26 +09:00
Junio C Hamano	59bb0aa93e	Merge branch 'so/log-diff-merge' "git log" learned "--diff-merges=<style>" option, with an associated configuration variable log.diffMerges. * so/log-diff-merge: doc/diff-options: document new --diff-merges features diff-merges: introduce log.diffMerges config variable diff-merges: adapt -m to enable default diff format diff-merges: refactor set_diff_merges() diff-merges: introduce --diff-merges=on	2021-04-30 13:50:26 +09:00
Junio C Hamano	8e97852919	Merge branch 'ds/sparse-index-protections' Builds on top of the sparse-index infrastructure to mark operations that are not ready to mark with the sparse index, causing them to fall back on fully-populated index that they always have worked with. * ds/sparse-index-protections: (47 commits) name-hash: use expand_to_path() sparse-index: expand_to_path() name-hash: don't add directories to name_hash revision: ensure full index resolve-undo: ensure full index read-cache: ensure full index pathspec: ensure full index merge-recursive: ensure full index entry: ensure full index dir: ensure full index update-index: ensure full index stash: ensure full index rm: ensure full index merge-index: ensure full index ls-files: ensure full index grep: ensure full index fsck: ensure full index difftool: ensure full index commit: ensure full index checkout: ensure full index ...	2021-04-30 13:50:26 +09:00
Junio C Hamano	d250f90359	Merge branch 'ds/maintenance-prefetch-fix' The prefetch task in "git maintenance" assumed that "git fetch" from any remote would fetch all its local branches, which would fetch too much if the user is interested in only a subset of branches there. * ds/maintenance-prefetch-fix: maintenance: respect remote.*.skipFetchAll maintenance: use 'git fetch --prefetch' fetch: add --prefetch option maintenance: simplify prefetch logic	2021-04-30 13:50:25 +09:00
Junio C Hamano	a819e2b3ef	Merge branch 'ow/push-quiet-set-upstream' "git push --quiet --set-upstream" was not quiet when setting the upstream branch configuration, which has been corrected. * ow/push-quiet-set-upstream: transport: respect verbosity when setting upstream	2021-04-30 13:50:25 +09:00
Junio C Hamano	279a2e637a	Merge branch 'mt/pkt-write-errors' When packet_write() fails, we gave an extra error message unnecessarily, which has been corrected. * mt/pkt-write-errors: pkt-line: do not report packet write errors twice	2021-04-30 13:50:24 +09:00
Junio C Hamano	13158b9910	Merge branch 'jk/promisor-optim' Handling of "promisor packs" that allows certain objects to be missing and lazily retrievable has been optimized (a bit). * jk/promisor-optim: revision: avoid parsing with --exclude-promisor-objects lookup_unknown_object(): take a repository argument is_promisor_object(): free tree buffer after parsing	2021-04-30 13:50:24 +09:00
Ramsay Jones	4cd66e7d6b	bisect--helper: use BISECT_TERMS in 'bisect skip' command Commit `e4c7b33747` ("bisect--helper: reimplement `bisect_skip` shell function in C", 2021-02-03), as part of the shell-to-C conversion, forgot to read the 'terms' file (.git/BISECT_TERMS) during the new 'bisect skip' command implementation. As a result, the 'bisect skip' command will use the default 'bad'/'good' terms. If the bisection terms have been set to non-default values (for example by the 'bisect start' command), then the 'bisect skip' command will fail. In order to correct this problem, we insert a call to the get_terms() function, which reads the non-default terms from that file (if set), in the '--bisect-skip' command implementation of 'bisect--helper'. Also, add a test[1] to protect against potential future regression. [1] https://lore.kernel.org/git/xmqqim45h585.fsf@gitster.g/T/#m207791568054b0f8cf1a3942878ea36293273c7d Reported-by: Trygve Aaberge <trygveaa@gmail.com> Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com> Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-30 09:56:42 +09:00
Adam Dinwoodie	bccc37fdc7	cygwin: disallow backslashes in file names The backslash character is not a valid part of a file name on Windows. If, in Windows, Git attempts to write a file that has a backslash character in the filename, it will be incorrectly interpreted as a directory separator. This caused CVE-2019-1354 in MinGW, as this behaviour can be manipulated to cause the checkout to write to files it ought not write to, such as adding code to the .git/hooks directory. This was fixed by `e1d911dd4c` (mingw: disallow backslash characters in tree objects' file names, 2019-09-12). However, the vulnerability also exists in Cygwin: while Cygwin mostly provides a POSIX-like path system, it will still interpret a backslash as a directory separator. To avoid this vulnerability, CVE-2021-29468, extend the previous fix to also apply to Cygwin. Similarly, extend the test case added by the previous version of the commit. The test suite doesn't have an easy way to say "run this test if in MinGW or Cygwin", so add a new test prerequisite that covers both. As well as checking behaviour in the presence of paths containing backslashes, the existing test also checks behaviour in the presence of paths that differ only by the presence of a trailing ".". MinGW follows normal Windows application behaviour and treats them as the same path, but Cygwin more closely emulates *nix systems (at the expense of compatibility with native Windows applications) and will create and distinguish between such paths. Gate the relevant bit of that test accordingly. Reported-by: RyotaK <security@ryotak.me> Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Adam Dinwoodie <adam@dinwoodie.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-30 09:49:20 +09:00
Patrick Steinhardt	c331551ccf	git: support separate arg for `--config-env`'s value While not documented as such, many of the top-level options like `--git-dir` and `--work-tree` support two syntaxes: they accept both an equals sign between option and its value, and they do support option and value as two separate arguments. The recently added `--config-env` option only supports the syntax with an equals sign. Mitigate this inconsistency by accepting both syntaxes and add tests to verify both work. Signed-off-by: Patrick Steinhardt <ps@pks.im> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-30 09:46:53 +09:00
Patrick Steinhardt	9152904c11	git.txt: fix synopsis of `--config-env` missing the equals sign When executing `git -h`, then the `--config-env` documentation rightly lists the option as requiring an equals between the option and its argument: this is the only currently supported format. But the git(1) manpage incorrectly lists the option as taking a space in between. Fix the issue by adding the missing space. Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-of-by: Patrick Steinhardt <ps@pks.im> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-30 09:46:46 +09:00
Jerry Zhang	526705fd3d	apply: adjust messages to account for --3way changes "git apply" specifically calls out when it is falling back to 3way merge application. Since the order changed to preferring 3way and falling back to direct application, continue that behavior by printing whenever 3way fails and git has to fall back. Signed-off-by: Jerry Zhang <jerry@skydio.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-29 12:27:45 +09:00
Jeff King	2ba582ba4c	prune: save reachable-from-recent objects with bitmaps We pass our prune expiration to mark_reachable_objects(), which will traverse not only the reachable objects, but consider any recent ones as tips for reachability; see `d3038d22f9` (prune: keep objects reachable from recent objects, 2014-10-15) for details. However, this interacts badly with the bitmap code path added in `fde67d6896` (prune: use bitmaps for reachability traversal, 2019-02-13). If we hit the bitmap-optimized path, we return immediately to avoid the regular traversal, accidentally skipping the "also traverse recent" code. Instead, we should do an if-else for the bitmap versus regular traversal, and then follow up with the "recent" traversal in either case. This reuses the "rev_info" for a bitmap and then a regular traversal, but that should work OK (the bitmap code clears the pending array in the usual way, just like a regular traversal would). Note that I dropped the comment above the regular traversal here. It has little explanatory value, and makes the if-else logic much harder to read. Here are a few variants that I rejected: - it seems like both the reachability and recent traversals could be done in a single traversal. This was rejected by `d3038d22f9` (prune: keep objects reachable from recent objects, 2014-10-15), though the balance may be different when using bitmaps. However, there's a subtle correctness issue, too: we use revs->ignore_missing_links for the recent traversal, but not the reachability one. - we could try using bitmaps for the recent traversal, too, which could possibly improve performance. But it would require some fixes in the bitmap code, which uses ignore_missing_links for its own purposes. Plus it would probably not help all that much in practice. We use the reachable tips to generate bitmaps, so those objects are likely not covered by bitmaps (unless they just became unreachable). And in general, we expect the set of unreachable objects to be much smaller anyway, so there's less to gain. The test in t5304 detects the bug and confirms the fix. I also beefed up the tests in t6501, which covers the mtime-checking code more thoroughly, to handle the bitmap case (in addition to just "loose" and "packed" cases). Interestingly, this test doesn't actually detect the bug, because it is running "git gc", and not "prune" directly. And "gc" will call "repack" first, which does not suffer the same bug. So the old-but-reachable-from-recent objects get scooped up into the new pack along with the actually-recent objects, which gives both a recent mtime. But it seemed prudent to get more coverage of the bitmap case for related code. Reported-by: David Emett <dave@sp4m.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-29 10:38:25 +09:00
Jeff King	1e951c6473	pack-bitmap: clean up include_check after use When a bitmap walk has to traverse (to fill in non-bitmapped objects), we use rev_info's include_check mechanism to let us stop the traversal early. But after setting the function and its data parameter, we never clean it up. This means that if the rev_info is used for a subsequent traversal without bitmaps, it will unexpectedly call into our include_check function (worse, it will do so pointing to a now-defunct stack variable in include_check_data, likely resulting in a segfault). There's no code which does this now, but it's an accident waiting to happen. Let's clean up after ourselves in the bitmap code. Reported-by: David Emett <dave@sp4m.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-29 10:03:46 +09:00
Luke Shumaker	9a3e3ca2ba	subtree: be stricter about validating flags Don't silently ignore a flag that's invalid for a given subcommand. The user expected it to do something; we should tell the user that they are mistaken, instead of surprising the user. It could be argued that this change might break existing users. I'd argue that those existing users are already broken, and they just don't know it. Let them know that they're broken. Signed-off-by: Luke Shumaker <lukeshu@datawire.io> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-28 16:47:19 +09:00
Luke Shumaker	49470cd445	subtree: push: allow specifying a local rev other than HEAD 'git subtree split' lets you specify a rev other than HEAD. 'git push' lets you specify a mapping between a local thing and a remot ref. So smash those together, and have 'git subtree push' let you specify which local thing to run split on and push the result of that split to the remote ref. Signed-off-by: Luke Shumaker <lukeshu@datawire.io> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-28 16:47:19 +09:00
Luke Shumaker	94389e7c81	subtree: allow 'split' flags to be passed to 'push' 'push' does a 'split' internally, but it doesn't pass flags through to the 'split'. This is silly, if you need to pass flags to 'split', then it means that you can't use 'push'! So, have 'push' accept 'split' flags, and pass them through to 'split'. Add tests for this by copying split's tests with minimal modification. Signed-off-by: Luke Shumaker <lukeshu@datawire.io> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-28 16:47:19 +09:00
Luke Shumaker	cb6551447b	subtree: allow --squash to be used with --rejoin Besides being a genuinely useful thing to do, this also just makes sense and harmonizes which flags may be used when. `git subtree split --rejoin` amounts to "automatically go ahead and do a `git subtree merge` after doing the main `git subtree split`", so it's weird and arbitrary that you can't pass `--squash` to `git subtree split --rejoin` like you can `git subtree merge`. It's weird that `git subtree split --rejoin` inherits `git subtree merge`'s `--message` but not `--squash`. Reconcile the situation by just having `split --rejoin` actually just call `merge` internally (or call `add` instead, as appropriate), so it can get access to the full `merge` behavior, including `--squash`. Signed-off-by: Luke Shumaker <lukeshu@datawire.io> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-28 16:47:19 +09:00
Luke Shumaker	6468784dd2	subtree: give the docs a once-over Just went through the docs looking for anything inaccurate or that can be improved. In the '-h' text, in the man page synopsis, and in the man page description: Normalize the ordering of the list of sub-commands: 'add', 'merge', 'split', 'pull', 'push'. This allows us to kinda separate the lower-level add/merge/split from the higher-level pull/push. '-h' text: - correction: Indicate that split's arg is optional. - clarity: Emphasize that 'pull' takes the 'add'/'merge' flags. man page: - correction: State that all subcommands take options (it seemed to indicate that only 'split' takes any options other than '-P'). - correction: 'split' only guarantees that the results are identical if the flags are identical. - correction: The flag is named '--ignore-joins', not '--ignore-join'. - completeness: Clarify that 'push' always operates on HEAD, and that 'split' operates on HEAD if no local commit is given. - clarity: In the description, when listing commands, repeat what their arguments are. This way the reader doesn't need to flip back and forth between the command description and the synopsis and the full description to understand what's being said. - clarity: In the <variables> used to give command arguments, give slightly longer, descriptive names. Like <local-commit> instead of just <commit>. - clarity: Emphasize that 'pull' takes the 'add'/'merge' flags. - style: In the synopsis, list options before the subcommand. This makes things line up and be much more readable when shown non-monospace (such as in `make html`), and also more closely matches other man pages (like `git-submodule.txt`). - style: Use the correct syntax for indicating the options ([<options>] instead of [OPTIONS]). - style: In the synopsis, separate 'pull' and 'push' from the other lower-level commands. I think this helps readability. - style: Code-quote things in prose that seem like they should be code-quoted, like '.gitmodules', flags, or full commands. - style: Minor wording improvements, like more consistent mood (many of the command descriptions start in the imperative mood and switch to the indicative mode by the end). That sort of thing. - style: Capitalize "ID". - style: Remove the "This option is only valid for XXX command" remarks from each option, and instead rely on the section headings. - style: Since that line is getting edited anyway, switch "behaviour" to American "behavior". - style: Trim trailing whitespace. `todo`: - style: Trim trailing whitespace. Signed-off-by: Luke Shumaker <lukeshu@datawire.io> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-28 16:47:19 +09:00
Luke Shumaker	e9525a8a02	subtree: have $indent actually affect indentation Currently, the $indent variable is just used to track how deeply we're nested, and the debug log is indented by things like debug " foo" That is: The indentation-level is hard-coded. It used to be that the code couldn't recurse, so the indentation level could be known statically, so it made sense to just hard-code it in the output. However, since `315a84f9aa` ("subtree: use commits before rejoins for splits", 2018-09-28), it can now recurse, and the debug log is misleading. So fix that. Indent according to $indent. Signed-off-by: Luke Shumaker <lukeshu@datawire.io> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-28 16:47:18 +09:00
Luke Shumaker	534ff90dbd	subtree: don't let debug and progress output clash Currently, debug output (triggered by passing '-d') and progress output stomp on each other. The debug output is just streamed as lines to stderr, and the progress output is sent to stderr as '%s\r'. When writing to a file, it is awkward to read and difficult to distinguish between the debug output and a progress line. When writing to a terminal the debug lines hide progress lines. So, when '-d' has been passed, spit out progress as 'progress: %s\n', instead of as '%s\r', so that it can be detected, and so that the debug lines don't overwrite the progress when written to a terminal. Signed-off-by: Luke Shumaker <lukeshu@datawire.io> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-28 16:47:18 +09:00
Luke Shumaker	5cdae0f6fd	subtree: add comments and sanity checks For each function in subtree, add a usage comment saying what the arguments are, and add an `assert` checking the number of arguments. In figuring out each thing's arguments in order to write those comments and assertions, it turns out that find_existing_splits is written as if it takes multiple 'revs', but it is in fact only ever passed a single 'rev': unrevs="$(find_existing_splits "$dir" "$rev")" \|\| exit $? So go ahead and codify that by documenting and asserting that it takes exactly two arguments, one dir and one rev. Signed-off-by: Luke Shumaker <lukeshu@datawire.io> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-28 16:47:18 +09:00
Luke Shumaker	cbb5de8b83	subtree: remove duplicate check `cmd_add` starts with a check that the directory doesn't yet exist. However, the `main` function performs the exact same check before calling `cmd_add`. So remove the check from `cmd_add`. Signed-off-by: Luke Shumaker <lukeshu@datawire.io> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-28 16:47:18 +09:00
Luke Shumaker	e4f8baa88a	subtree: parse revs in individual cmd_ functions The main argument parser goes ahead and tries to parse revs to make things simpler for the sub-command implementations. But, it includes enough special cases for different sub-commands. And it's difficult having having to think about "is this info coming from an argument, or a global variable?". So the main argument parser's effort to make things "simpler" ends up just making it more confusing and complicated. Begone with the 'revs' global variable; parse 'rev=$(...)' as needed in individual 'cmd_*' functions. Begone with the 'default' global variable. Its would-be value is knowable just from which function we're in. Begone with the 'ensure_single_rev' function. Its functionality can be achieved by passing '--verify' to 'git rev-parse'. Signed-off-by: Luke Shumaker <lukeshu@datawire.io> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-28 16:47:18 +09:00
Luke Shumaker	bbffb02383	subtree: use "^{commit}" instead of "^0" They are synonyms. Both are used in the file. ^{commit} is clearer, so "standardize" on that. Signed-off-by: Luke Shumaker <lukeshu@datawire.io> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-28 16:47:18 +09:00
Luke Shumaker	22d5507493	subtree: don't fuss with PATH Scripts needing to fuss with with adding $(git --exec-prefix) PATH before loading git-sh-setup is a thing of the past. As far as I can tell, it's been a thing of the past since since Git v1.2.0 (2006-02-12), or more specifically, since `77cb17e940` (Exec git programs without using PATH, 2006-01-10). However, it stuck around in contrib scripts and in third-party scripts for long enough that it wasn't unusual to see. Originally `git subtree` didn't fuss with PATH, but when people (including the original subtree author) had problems, because it was a common thing to see, it seemed that having subtree fuss with PATH was a reasonable solution. Here is an abridged history of fussing with PATH in subtree: `2987e6add3` (Add explicit path of git installation by 'git --exec-path', Gianluca Pacchiella, 2009-08-20) As pointed out by documentation, the correct use of 'git-sh-setup' is using $(git --exec-path) to avoid problems with not standard installations. -. git-sh-setup +. $(git --exec-path)/git-sh-setup `33aaa697a2` (Improve patch to use git --exec-path: add to PATH instead, Avery Pennarun, 2009-08-26) If you (like me) are using a modified git straight out of its source directory (ie. without installing), then --exec-path isn't actually correct. Add it to the PATH instead, so if it is correct, it'll work, but if it's not, we fall back to the previous behaviour. -. $(git --exec-path)/git-sh-setup +PATH=$(git --exec-path):$PATH +. git-sh-setup `9c632ea29c` ((Hopefully) fix PATH setting for msysgit, Avery Pennarun, 2010-06-24) Reported by Evan Shaw. The problem is that $(git --exec-path) includes a 'git' binary which is incompatible with the one in /usr/bin; if you run it, it gives you an error about libiconv2.dll. +OPATH=$PATH PATH=$(git --exec-path):$PATH . git-sh-setup +PATH=$OPATH # apparently needed for some versions of msysgit `df2302d774` (Another fix for PATH and msysgit, Avery Pennarun, 2010-06-24) Evan Shaw tells me the previous fix didn't work. Let's use this one instead, which he says does work. This fix is kind of wrong because it will run the "correct" git-sh-setup after the one in /usr/bin, if there is one, which could be weird if you have multiple versions of git installed. But it works on my Linux and his msysgit, so it's obviously better than what we had before. -OPATH=$PATH -PATH=$(git --exec-path):$PATH +PATH=$PATH:$(git --exec-path) . git-sh-setup -PATH=$OPATH # apparently needed for some versions of msysgit First of all, I disagree with Gianluca's reading of the documentation: - I haven't gone back to read what the documentation said in 2009, but in my reading of the 2021 documentation is that it includes "$(git --exec-path)/" in the synopsis for illustrative purposes, not to say it's the proper way. - After being executed by `git`, the git exec path should be the very first entry in PATH, so it shouldn't matter. - None of the scripts that are part of git do it that way. But secondly, the root reason for fussing with PATH seems to be that Avery didn't know that he needs to set GIT_EXEC_PATH if he's going to use git from the source directory without installing. And finally, Evan's issue is clearly just a bug in msysgit. I assume that msysgit has since fixed the issue, and also msysgit has been deprecated for 6 years now, so let's drop the workaround for it. So, remove the line fussing with PATH. However, since subtree is in 'contrib/' and it might get installed in funny ways by users after-the-fact, add a sanity check to the top of the script, checking that it is installed correctly. Signed-off-by: Luke Shumaker <lukeshu@datawire.io> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-28 16:47:18 +09:00
Luke Shumaker	a94f911072	subtree: use "$" instead of "$@" as appropriate "$" is for when you want to concatenate the args together, whitespace-separated; and "$@" is for when you want them to be separate strings. There are several places in subtree that erroneously use $@ when concatenating args together into an error message. For instance, if the args are argv[1]="dead" and argv[2]="beef", then the line die "You must provide exactly one revision. Got: '$@'" surely intends to call 'die' with the argument argv[1]="You must provide exactly one revision. Got: 'dead beef'" however, because the line used $@ instead of $, it will actually call 'die' with the arguments argv[1]="You must provide exactly one revision. Got: 'dead" argv[2]="beef'" This isn't a big deal, because 'die' concatenates its arguments together anyway (using "$"). But that doesn't change the fact that it was a mistake to use $@ instead of $*, even though in the end $@ still ended up doing the right thing. Signed-off-by: Luke Shumaker <lukeshu@datawire.io> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-28 16:47:18 +09:00
Luke Shumaker	e2b11e4211	subtree: use more explicit variable names for cmdline args Make it painfully obvious when reading the code which variables are direct parsings of command line arguments. Signed-off-by: Luke Shumaker <lukeshu@datawire.io> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-28 16:47:18 +09:00
Luke Shumaker	6d43585a68	subtree: use git-sh-setup's `say` subtree currently defines its own `say` implementation, rather than using git-sh-setups's implementation. Change that, don't re-invent the wheel. Signed-off-by: Luke Shumaker <lukeshu@datawire.io> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-28 16:47:18 +09:00
Luke Shumaker	f664304836	subtree: use `git merge-base --is-ancestor` Instead of writing a slow `rev_is_descendant_of_branch $a $b` function in shell, just use the fast `git merge-base --is-ancestor $b $a`. Signed-off-by: Luke Shumaker <lukeshu@datawire.io> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-28 16:47:17 +09:00
Luke Shumaker	8dc3240f5f	subtree: drop support for git < 1.7 Suport for Git versions older than 1.7.0 (older than February 2010) was nice to have when git-subtree lived out-of-tree. But now that it lives in git.git, it's not necessary to keep around. While it's technically in contrib, with the standard 'git' packages for common systems (including Arch Linux and macOS) including git-subtree, it seems vanishingly likely to me that people are separately installing git-subtree from git.git alongside an older 'git' install (although it also seems vanishingly likely that people are still using >11 year old git installs). Not that there's much reason to remove it either, it's not much code, and none of my changes depend on a newer git (to my knowledge, anyway; I'm not actually testing against older git). I just figure it's an easy piece of fat to trim, in the journey to making the whole thing easier to hack on. "Ignore space change" is probably helpful when viewing this diff. Signed-off-by: Luke Shumaker <lukeshu@datawire.io> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-28 16:47:17 +09:00
Luke Shumaker	d2f0f81954	subtree: more consistent error propagation Ensure that every $(subshell) that calls a function (as opposed to an external executable) is followed by `\|\| exit $?`. Similarly, ensure that every `cmd \| while read; do ... done` loop is followed by `\|\| exit $?`. Both of those constructs mean that it can miss `die` calls, and keep running when it shouldn't. Signed-off-by: Luke Shumaker <lukeshu@datawire.io> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-28 16:47:17 +09:00
Luke Shumaker	5a3569774f	subtree: don't have loose code outside of a function Shove all of the loose code inside of a main() function. This comes down to personal preference more than anything else. A preference that I've developed over years of maintaining large Bash scripts, but still a mere personal preference. In this specific case, it's also moving the `set -- -h`, the `git rev-parse --parseopt`, and the `. git-sh-setup` to be closer to all the rest of the argument parsing, which is a readability win on its own, IMO. "Ignore space change" is probably helpful when viewing this diff. Signed-off-by: Luke Shumaker <lukeshu@datawire.io> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-28 16:47:17 +09:00
Luke Shumaker	b04538d99f	subtree: t7900: add porcelain tests for 'pull' and 'push' The 'pull' and 'push' subcommands deserve their own sections in the tests. Add some basic tests for them. Signed-off-by: Luke Shumaker <lukeshu@datawire.io> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-28 16:47:17 +09:00
Luke Shumaker	b269976979	subtree: t7900: add a test for the -h flag It's a dumb test, but it's surprisingly easy to break. Signed-off-by: Luke Shumaker <lukeshu@datawire.io> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-28 16:47:17 +09:00
Luke Shumaker	db6952b2b2	subtree: t7900: rename last_commit_message to last_commit_subject t7900-subtree.sh defines a helper function named last_commit_message. However, it only returns the subject line of the commit message, not the entire commit message. So rename it, to make the name less confusing. Signed-off-by: Luke Shumaker <lukeshu@datawire.io> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-28 16:47:17 +09:00
Luke Shumaker	f1cd2d93c2	subtree: t7900: fix 'verify one file change per commit' As far as I can tell, this test isn't actually testing anything, because someone forgot to tack on `--name-only` to `git log`. This seems to have been the case since the test was first written, back in `fa16ab36ad` ("test.sh: make sure no commit changes more than one file at a time.", 2009-04-26), unless `git log` used to do that by default and didn't need the flag back then? Convincing myself that it's not actually testing anything was tricky, the code is a little hard to reason about. It can be made a lot simpler if instead of trying to parse all of the info from a single `git log`, we're OK calling `git log` from inside of a loop. And it's my opinion that tests are not the place for clever optimized code. So, fix and simplify the test, so that it's actually testing something and is simpler to reason about. Signed-off-by: Luke Shumaker <lukeshu@datawire.io> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-28 16:47:17 +09:00
Luke Shumaker	63ac4f1ade	subtree: t7900: delete some dead code Signed-off-by: Luke Shumaker <lukeshu@datawire.io> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-28 16:47:17 +09:00
Luke Shumaker	c4566ab429	subtree: t7900: use 'test' for string equality t7900-subtree.sh defines its own `check_equal A B` function, instead of just using `test A = B` like all of the other tests. Don't be special, get rid of `check_equal` in favor of `test`. Signed-off-by: Luke Shumaker <lukeshu@datawire.io> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-28 16:47:16 +09:00
Luke Shumaker	40b1e1ec58	subtree: t7900: comment subtree_test_create_repo It's unclear what the purpose of t7900-subtree.sh's `subtree_test_create_repo` helper function is. It wraps test-lib.sh's, `test_create_repo` but follows that up by setting log.date=relative. Why does it set log.date=relative? My first guess was that at one point the tests required that, but no longer do, and that the function is now vestigial. I even wrote a patch to get rid of it and was moments away from `git send-email`ing it. However, by chance when looking for something else in the history, I discovered the true reason, from `e7aac44ed2` (contrib/subtree: ignore log.date configuration, 2015-07-21). It's testing that setting log.date=relative doesn't break `git subtree`, as at one point in the past that did break `git subtree`. So, add a comment about this, to avoid future such confusion. And while at it, go ahead and (1) touch up the function to avoid a pointless subshell and (2) update the one test that didn't use it. Signed-off-by: Luke Shumaker <lukeshu@datawire.io> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-28 16:47:16 +09:00
Luke Shumaker	f700406957	subtree: t7900: use consistent formatting The formatting in t7900-subtree.sh isn't even consistent throughout the file. Fix that; make it consistent throughout the file. Signed-off-by: Luke Shumaker <lukeshu@datawire.io> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-28 16:47:16 +09:00
Luke Shumaker	f2bb7fef7a	subtree: t7900: use test-lib.sh's test_count Use test-lib.sh's `test_count`, instead instead of having t7900-subtree.sh do its own book-keeping with `subtree_test_count` that has to be explicitly incremented by calling `next_test`. Signed-off-by: Luke Shumaker <lukeshu@datawire.io> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-28 16:47:16 +09:00
Luke Shumaker	914d512551	subtree: t7900: update for having the default branch name be 'main' Most of the tests had been converted to support `GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main`, but `contrib/subtree/t/` hadn't. Convert it. Most of the mentions of 'master' can just be replaced with 'HEAD'. Signed-off-by: Luke Shumaker <lukeshu@datawire.io> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-28 16:47:16 +09:00
Luke Shumaker	4c996deb4a	.gitignore: ignore 'git-subtree' as a build artifact Running `make -C contrib/subtree/ test` creates a `git-subtree` executable in the root of the repo. Add it to the .gitignore so that anyone hacking on subtree won't have to deal with that noise. Signed-off-by: Luke Shumaker <lukeshu@datawire.io> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-28 16:46:30 +09:00
Rafael Silva	a643157d5a	repack: avoid loosening promisor objects in partial clones When `git repack -A -d` is run in a partial clone, `pack-objects` is invoked twice: once to repack all promisor objects, and once to repack all non-promisor objects. The latter `pack-objects` invocation is with --exclude-promisor-objects and --unpack-unreachable, which loosens all objects unused during this invocation. Unfortunately, this includes promisor objects. Because the -d argument to `git repack` subsequently deletes all loose objects also in packs, these just-loosened promisor objects will be immediately deleted. However, this extra disk churn is unnecessary in the first place. For example, in a newly-cloned partial repo that filters all blob objects (e.g. `--filter=blob:none`), `repack` ends up unpacking all trees and commits into the filesystem because every object, in this particular case, is a promisor object. Depending on the repo size, this increases the disk usage considerably: In my copy of the linux.git, the object directory peaked 26GB of more disk usage. In order to avoid this extra disk churn, pass the names of the promisor packfiles as --keep-pack arguments to the second invocation of `pack-objects`. This informs `pack-objects` that the promisor objects are already in a safe packfile and, therefore, do not need to be loosened. For testing, we need to validate whether any object was loosened. However, the "evidence" (loosened objects) is deleted during the process which prevents us from inspecting the object directory. Instead, let's teach `pack-objects` to count loosened objects and emit via trace2 thus allowing inspecting the debug events after the process is finished. This new event is used on the added regression test. Lastly, add a new perf test to evaluate the performance impact made by this changes (tested on git.git): Test HEAD^ HEAD ---------------------------------------------------------- 5600.3: gc 134.38(41.93+90.95) 7.80(6.72+1.35) -94.2% For a bigger repository, such as linux.git, the improvement is even bigger: Test HEAD^ HEAD ------------------------------------------------------------------- 5600.3: gc 6833.00(918.07+3162.74) 268.79(227.02+39.18) -96.1% These improvements are particular big because every object in the newly-cloned partial repository is a promisor object. Reported-by: SZEDER Gábor <szeder.dev@gmail.com> Helped-by: Jeff King <peff@peff.net> Helped-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Rafael Silva <rafaeloliveira.cs@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-28 13:36:13 +09:00
Peter Oliver	7a14acdbe6	doc: point to diff attribute in patch format docs From the documentation for generating patch text with diff-related commands, refer to the documentation for the diff attribute. This attribute influences the way that patches are generated, but this was previously not mentioned in e.g., the git-diff manpage. Signed-off-by: Peter Oliver <git@mavit.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-28 13:34:44 +09:00
Andrzej Hunt	37be11994f	builtin/rm: avoid leaking pathspec and seen parse_pathspec() populates pathspec, hence we need to clear it once it's no longer needed. seen is xcalloc'd within the same function and likewise needs to be freed once its no longer needed. cmd_rm() has multiple early returns, therefore we need to clear or free as soon as this data is no longer needed, as opposed to doing a cleanup at the end. LSAN output from t0020: Direct leak of 112 byte(s) in 1 object(s) allocated from: #0 0x49a85d in malloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3 #1 0x9ac0a4 in do_xmalloc wrapper.c:41:8 #2 0x9ac07a in xmalloc wrapper.c:62:9 #3 0x873277 in parse_pathspec pathspec.c:582:2 #4 0x646ffa in cmd_rm builtin/rm.c:266:2 #5 0x4cd91d in run_builtin git.c:467:11 #6 0x4cb5f3 in handle_builtin git.c:719:3 #7 0x4ccf47 in run_argv git.c:808:4 #8 0x4caf49 in cmd_main git.c:939:19 #9 0x69dc0e in main common-main.c:52:11 #10 0x7f948825b349 in __libc_start_main (/lib64/libc.so.6+0x24349) Indirect leak of 65 byte(s) in 1 object(s) allocated from: #0 0x49ab79 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3 #1 0x9ac2a6 in xrealloc wrapper.c:126:8 #2 0x93b14d in strbuf_grow strbuf.c:98:2 #3 0x93ccf6 in strbuf_vaddf strbuf.c:392:3 #4 0x93f726 in xstrvfmt strbuf.c:979:2 #5 0x93f8b3 in xstrfmt strbuf.c:989:8 #6 0x92ad8a in prefix_path_gently setup.c:115:15 #7 0x873a8d in init_pathspec_item pathspec.c:439:11 #8 0x87334f in parse_pathspec pathspec.c:589:3 #9 0x646ffa in cmd_rm builtin/rm.c:266:2 #10 0x4cd91d in run_builtin git.c:467:11 #11 0x4cb5f3 in handle_builtin git.c:719:3 #12 0x4ccf47 in run_argv git.c:808:4 #13 0x4caf49 in cmd_main git.c:939:19 #14 0x69dc0e in main common-main.c:52:11 #15 0x7f948825b349 in __libc_start_main (/lib64/libc.so.6+0x24349) Indirect leak of 15 byte(s) in 1 object(s) allocated from: #0 0x486834 in strdup ../projects/compiler-rt/lib/asan/asan_interceptors.cpp:452:3 #1 0x9ac048 in xstrdup wrapper.c:29:14 #2 0x873ba2 in init_pathspec_item pathspec.c:468:20 #3 0x87334f in parse_pathspec pathspec.c:589:3 #4 0x646ffa in cmd_rm builtin/rm.c:266:2 #5 0x4cd91d in run_builtin git.c:467:11 #6 0x4cb5f3 in handle_builtin git.c:719:3 #7 0x4ccf47 in run_argv git.c:808:4 #8 0x4caf49 in cmd_main git.c:939:19 #9 0x69dc0e in main common-main.c:52:11 #10 0x7f948825b349 in __libc_start_main (/lib64/libc.so.6+0x24349) Direct leak of 1 byte(s) in 1 object(s) allocated from: #0 0x49a9d2 in calloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:154:3 #1 0x9ac392 in xcalloc wrapper.c:140:8 #2 0x647108 in cmd_rm builtin/rm.c:294:9 #3 0x4cd91d in run_builtin git.c:467:11 #4 0x4cb5f3 in handle_builtin git.c:719:3 #5 0x4ccf47 in run_argv git.c:808:4 #6 0x4caf49 in cmd_main git.c:939:19 #7 0x69dbfe in main common-main.c:52:11 #8 0x7f4fac1b0349 in __libc_start_main (/lib64/libc.so.6+0x24349) Signed-off-by: Andrzej Hunt <ajrhunt@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-28 09:25:45 +09:00
Andrzej Hunt	805b789a69	builtin/rebase: release git_format_patch_opt too options.git_format_patch_opt can be populated during cmd_rebase's setup, and will therefore leak on return. Although we could just UNLEAK all of options, we choose to strbuf_release() the individual member, which matches the existing pattern (where we're freeing invidual members of options). Leak found when running t0021: Direct leak of 24 byte(s) in 1 object(s) allocated from: #0 0x49ab79 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3 #1 0x9ac296 in xrealloc wrapper.c:126:8 #2 0x93b13d in strbuf_grow strbuf.c:98:2 #3 0x93bd3a in strbuf_add strbuf.c:295:2 #4 0x60ae92 in strbuf_addstr strbuf.h:304:2 #5 0x605f17 in cmd_rebase builtin/rebase.c:1759:3 #6 0x4cd91d in run_builtin git.c:467:11 #7 0x4cb5f3 in handle_builtin git.c:719:3 #8 0x4ccf47 in run_argv git.c:808:4 #9 0x4caf49 in cmd_main git.c:939:19 #10 0x69dbfe in main common-main.c:52:11 #11 0x7f66dae91349 in __libc_start_main (/lib64/libc.so.6+0x24349) SUMMARY: AddressSanitizer: 24 byte(s) leaked in 1 allocation(s). Signed-off-by: Andrzej Hunt <ajrhunt@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-28 09:25:45 +09:00
Andrzej Hunt	a317a553b8	builtin/for-each-ref: free filter and UNLEAK sorting. sorting might be a list allocated in ref_default_sorting() (in this case it's a fixed single item list, which has nevertheless been xcalloc'd), or it might be a list allocated in parse_opt_ref_sorting(). In either case we could free these lists - but instead we UNLEAK as we're at the end of cmd_for_each_ref. (There's no existing implementation of clear_ref_sorting(), and writing a loop to free the list seems more trouble than it's worth.) filter.with_commit/no_commit are populated via OPT_CONTAINS/OPT_NO_CONTAINS, both of which create new entries via parse_opt_commits(), and also need to be free'd or UNLEAK'd. Because free_commit_list() already exists, we choose to use that over an UNLEAK. LSAN output from t0041: Direct leak of 16 byte(s) in 1 object(s) allocated from: #0 0x49a9d2 in calloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:154:3 #1 0x9ac252 in xcalloc wrapper.c:140:8 #2 0x8a4a55 in ref_default_sorting ref-filter.c:2486:32 #3 0x56c6b1 in cmd_for_each_ref builtin/for-each-ref.c:72:13 #4 0x4cd91d in run_builtin git.c:467:11 #5 0x4cb5f3 in handle_builtin git.c:719:3 #6 0x4ccf47 in run_argv git.c:808:4 #7 0x4caf49 in cmd_main git.c:939:19 #8 0x69dabe in main common-main.c:52:11 #9 0x7f2bdc570349 in __libc_start_main (/lib64/libc.so.6+0x24349) Direct leak of 16 byte(s) in 1 object(s) allocated from: #0 0x49a85d in malloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3 #1 0x9abf54 in do_xmalloc wrapper.c:41:8 #2 0x9abf2a in xmalloc wrapper.c:62:9 #3 0x717486 in commit_list_insert commit.c:540:33 #4 0x8644cf in parse_opt_commits parse-options-cb.c:98:2 #5 0x869bb5 in get_value parse-options.c:181:11 #6 0x8677dc in parse_long_opt parse-options.c:378:10 #7 0x8659bd in parse_options_step parse-options.c:817:11 #8 0x867fcd in parse_options parse-options.c:870:10 #9 0x56c62b in cmd_for_each_ref builtin/for-each-ref.c:59:2 #10 0x4cd91d in run_builtin git.c:467:11 #11 0x4cb5f3 in handle_builtin git.c:719:3 #12 0x4ccf47 in run_argv git.c:808:4 #13 0x4caf49 in cmd_main git.c:939:19 #14 0x69dabe in main common-main.c:52:11 #15 0x7f2bdc570349 in __libc_start_main (/lib64/libc.so.6+0x24349) Signed-off-by: Andrzej Hunt <ajrhunt@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-28 09:25:45 +09:00
Andrzej Hunt	f3a9680791	mailinfo: also free strbuf lists when clearing mailinfo mailinfo.p_hdr_info/s_hdr_info are null-terminated lists of strbuf's, with entries pointing either to NULL or an allocated strbuf. Therefore we need to free those strbuf's (and not just the data they contain) whenever we're done with a given entry. (See handle_header() where those new strbufs are malloc'd.) Once we no longer need the list (and not just its entries) we can switch over to strbuf_list_free() instead of manually iterating over the list, which takes care of those additional details for us. We can only do this in clear_mailinfo() - in handle_commit_message() we are only clearing the array contents but want to reuse the array itself, hence we can't use strbuf_list_free() there. However, strbuf_list_free() cannot handle a NULL input, and the lists we are freeing might be NULL. Therefore we add a NULL check in strbuf_list_free() to make it safe to use with a NULL input (which is a pattern used by some of the other *_free() functions around git). Leak output from t0023: Direct leak of 72 byte(s) in 3 object(s) allocated from: #0 0x49a85d in malloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3 #1 0x9ac9f4 in do_xmalloc wrapper.c:41:8 #2 0x9ac9ca in xmalloc wrapper.c:62:9 #3 0x7f6cf7 in handle_header mailinfo.c:205:10 #4 0x7f5abf in check_header mailinfo.c:583:4 #5 0x7f5524 in mailinfo mailinfo.c:1197:3 #6 0x4dcc95 in parse_mail builtin/am.c:1167:6 #7 0x4d9070 in am_run builtin/am.c:1732:12 #8 0x4d5b7a in cmd_am builtin/am.c:2398:3 #9 0x4cd91d in run_builtin git.c:467:11 #10 0x4cb5f3 in handle_builtin git.c:719:3 #11 0x4ccf47 in run_argv git.c:808:4 #12 0x4caf49 in cmd_main git.c:939:19 #13 0x69e43e in main common-main.c:52:11 #14 0x7fc1fadfa349 in __libc_start_main (/lib64/libc.so.6+0x24349) SUMMARY: AddressSanitizer: 72 byte(s) leaked in 3 allocation(s). Signed-off-by: Andrzej Hunt <ajrhunt@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-28 09:25:45 +09:00
Andrzej Hunt	52a9436aa7	builtin/checkout: clear pending objects after diffing add_pending_object() populates rev.pending, we need to take care of clearing it once we're done. This code is run close to the end of a checkout, therefore this leak seems like it would have very little impact. See also LSAN output from t0020 below: Direct leak of 2048 byte(s) in 1 object(s) allocated from: #0 0x49ab79 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3 #1 0x9acc46 in xrealloc wrapper.c:126:8 #2 0x83e3a3 in add_object_array_with_path object.c:337:3 #3 0x8f672a in add_pending_object_with_path revision.c:329:2 #4 0x8eaeab in add_pending_object_with_mode revision.c:336:2 #5 0x8eae9d in add_pending_object revision.c:342:2 #6 0x5154a0 in show_local_changes builtin/checkout.c:602:2 #7 0x513b00 in merge_working_tree builtin/checkout.c:979:3 #8 0x512cb3 in switch_branches builtin/checkout.c:1242:9 #9 0x50f8de in checkout_branch builtin/checkout.c:1646:9 #10 0x50ba12 in checkout_main builtin/checkout.c:2003:9 #11 0x5086c0 in cmd_checkout builtin/checkout.c:2055:8 #12 0x4cd91d in run_builtin git.c:467:11 #13 0x4cb5f3 in handle_builtin git.c:719:3 #14 0x4ccf47 in run_argv git.c:808:4 #15 0x4caf49 in cmd_main git.c:939:19 #16 0x69e43e in main common-main.c:52:11 #17 0x7f5dd1d50349 in __libc_start_main (/lib64/libc.so.6+0x24349) SUMMARY: AddressSanitizer: 2048 byte(s) leaked in 1 allocation(s). Signed-off-by: Andrzej Hunt <ajrhunt@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-28 09:25:45 +09:00
Andrzej Hunt	265644367f	builtin/check-ignore: clear_pathspec before returning parse_pathspec() allocates new memory into pathspec, therefore we need to free it when we're done. An UNLEAK would probably be just as good here - but clear_pathspec() is not much more work so we might as well use it. check_ignore() is either called once directly from cmd_check_ignore() (in which case the leak really doesnt matter), or it can be called multiple times in a loop from check_ignore_stdin_paths(), in which case we're potentially leaking multiple times - but even in this scenario the leak is so small as to have no real consequence. Found while running t0008: Direct leak of 112 byte(s) in 1 object(s) allocated from: #0 0x49a85d in malloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3 #1 0x9aca44 in do_xmalloc wrapper.c:41:8 #2 0x9aca1a in xmalloc wrapper.c:62:9 #3 0x873c17 in parse_pathspec pathspec.c:582:2 #4 0x503eb8 in check_ignore builtin/check-ignore.c:90:2 #5 0x5038af in cmd_check_ignore builtin/check-ignore.c:190:17 #6 0x4cd91d in run_builtin git.c:467:11 #7 0x4cb5f3 in handle_builtin git.c:719:3 #8 0x4ccf47 in run_argv git.c:808:4 #9 0x4caf49 in cmd_main git.c:939:19 #10 0x69e43e in main common-main.c:52:11 #11 0x7f18bb0dd349 in __libc_start_main (/lib64/libc.so.6+0x24349) Indirect leak of 65 byte(s) in 1 object(s) allocated from: #0 0x49ab79 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3 #1 0x9acc46 in xrealloc wrapper.c:126:8 #2 0x93baed in strbuf_grow strbuf.c:98:2 #3 0x93d696 in strbuf_vaddf strbuf.c:392:3 #4 0x9400c6 in xstrvfmt strbuf.c:979:2 #5 0x940253 in xstrfmt strbuf.c:989:8 #6 0x92b72a in prefix_path_gently setup.c:115:15 #7 0x87442d in init_pathspec_item pathspec.c:439:11 #8 0x873cef in parse_pathspec pathspec.c:589:3 #9 0x503eb8 in check_ignore builtin/check-ignore.c:90:2 #10 0x5038af in cmd_check_ignore builtin/check-ignore.c:190:17 #11 0x4cd91d in run_builtin git.c:467:11 #12 0x4cb5f3 in handle_builtin git.c:719:3 #13 0x4ccf47 in run_argv git.c:808:4 #14 0x4caf49 in cmd_main git.c:939:19 #15 0x69e43e in main common-main.c:52:11 #16 0x7f18bb0dd349 in __libc_start_main (/lib64/libc.so.6+0x24349) Indirect leak of 2 byte(s) in 1 object(s) allocated from: #0 0x486834 in strdup ../projects/compiler-rt/lib/asan/asan_interceptors.cpp:452:3 #1 0x9ac9e8 in xstrdup wrapper.c:29:14 #2 0x874542 in init_pathspec_item pathspec.c:468:20 #3 0x873cef in parse_pathspec pathspec.c:589:3 #4 0x503eb8 in check_ignore builtin/check-ignore.c:90:2 #5 0x5038af in cmd_check_ignore builtin/check-ignore.c:190:17 #6 0x4cd91d in run_builtin git.c:467:11 #7 0x4cb5f3 in handle_builtin git.c:719:3 #8 0x4ccf47 in run_argv git.c:808:4 #9 0x4caf49 in cmd_main git.c:939:19 #10 0x69e43e in main common-main.c:52:11 #11 0x7f18bb0dd349 in __libc_start_main (/lib64/libc.so.6+0x24349) SUMMARY: AddressSanitizer: 179 byte(s) leaked in 3 allocation(s). Signed-off-by: Andrzej Hunt <ajrhunt@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-28 09:25:45 +09:00
Andrzej Hunt	4fa268738c	builtin/bugreport: don't leak prefixed filename prefix_filename() returns newly allocated memory, and strbuf_addstr() doesn't take ownership of its inputs. Therefore we have to make sure to store and free prefix_filename()'s result. As this leak is in cmd_bugreport(), we could just as well UNLEAK the prefix - but there's no good reason not to just free it properly. This leak was found while running t0091, see output below: Direct leak of 24 byte(s) in 1 object(s) allocated from: #0 0x49ab79 in realloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3 #1 0x9acc66 in xrealloc wrapper.c:126:8 #2 0x93baed in strbuf_grow strbuf.c:98:2 #3 0x93c6ea in strbuf_add strbuf.c:295:2 #4 0x69f162 in strbuf_addstr ./strbuf.h:304:2 #5 0x69f083 in prefix_filename abspath.c:277:2 #6 0x4fb275 in cmd_bugreport builtin/bugreport.c:146:9 #7 0x4cd91d in run_builtin git.c:467:11 #8 0x4cb5f3 in handle_builtin git.c:719:3 #9 0x4ccf47 in run_argv git.c:808:4 #10 0x4caf49 in cmd_main git.c:939:19 #11 0x69df9e in main common-main.c:52:11 #12 0x7f523a987349 in __libc_start_main (/lib64/libc.so.6+0x24349) Signed-off-by: Andrzej Hunt <ajrhunt@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-28 09:25:45 +09:00
Andrzej Hunt	d895804b5a	branch: FREE_AND_NULL instead of NULL'ing real_ref real_ref was previously populated by dwim_ref(), which allocates new memory. We need to make sure to free real_ref when discarding it. (real_ref is already being freed at the end of create_branch() - but if we discard it early then it will leak.) This fixes the following leak found while running t0002-t0099: Direct leak of 5 byte(s) in 1 object(s) allocated from: #0 0x486954 in strdup /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_interceptors.cpp:452:3 #1 0xdd6484 in xstrdup wrapper.c:29:14 #2 0xc0f658 in expand_ref refs.c:671:12 #3 0xc0ecf1 in repo_dwim_ref refs.c:644:22 #4 0x8b1184 in dwim_ref ./refs.h:162:9 #5 0x8b0b02 in create_branch branch.c:284:10 #6 0x550cbb in update_refs_for_switch builtin/checkout.c:1046:4 #7 0x54e275 in switch_branches builtin/checkout.c:1274:2 #8 0x548828 in checkout_branch builtin/checkout.c:1668:9 #9 0x541306 in checkout_main builtin/checkout.c:2025:9 #10 0x5395fa in cmd_checkout builtin/checkout.c:2077:8 #11 0x4d02a8 in run_builtin git.c:467:11 #12 0x4cbfe9 in handle_builtin git.c:719:3 #13 0x4cf04f in run_argv git.c:808:4 #14 0x4cb85a in cmd_main git.c:939:19 #15 0x820cf6 in main common-main.c:52:11 #16 0x7f30bd9dd349 in __libc_start_main (/lib64/libc.so.6+0x24349) Signed-off-by: Andrzej Hunt <ajrhunt@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-28 09:25:45 +09:00
Andrzej Hunt	b180c681bb	bloom: clear each bloom_key after use fill_bloom_key() allocates memory into bloom_key, we need to clean that up once the key is no longer needed. This leak was found while running t0002-t0099. Although this leak is happening in code being called from a test-helper, the same code is also used in various locations around git, and can therefore happen during normal usage too. Gabor's analysis shows that peak-memory usage during 'git commit-graph write' is reduced on the order of 10% for a selection of larger repos (along with an even larger reduction if we override modified path bloom filter limits): https://lore.kernel.org/git/20210411072651.GF2947267@szeder.dev/ LSAN output: Direct leak of 308 byte(s) in 11 object(s) allocated from: #0 0x49a5e2 in calloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:154:3 #1 0x6f4032 in xcalloc wrapper.c:140:8 #2 0x4f2905 in fill_bloom_key bloom.c:137:28 #3 0x4f34c1 in get_or_compute_bloom_filter bloom.c:284:4 #4 0x4cb484 in get_bloom_filter_for_commit t/helper/test-bloom.c:43:11 #5 0x4cb072 in cmd__bloom t/helper/test-bloom.c:97:3 #6 0x4ca7ef in cmd_main t/helper/test-tool.c:121:11 #7 0x4caace in main common-main.c:52:11 #8 0x7f798af95349 in __libc_start_main (/lib64/libc.so.6+0x24349) SUMMARY: AddressSanitizer: 308 byte(s) leaked in 11 allocation(s). Signed-off-by: Andrzej Hunt <ajrhunt@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-28 09:25:44 +09:00
Andrzej Hunt	4c217a4c34	ls-files: free max_prefix when done common_prefix() returns a new string, which we store in max_prefix - this string needs to be freed to avoid a leak. This leak is happening in cmd_ls_files, hence is of no real consequence - an UNLEAK would be just as good, but we might as well free the string properly. Leak found while running t0002, see output below: Direct leak of 8 byte(s) in 1 object(s) allocated from: #0 0x49a85d in malloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3 #1 0x9ab1b4 in do_xmalloc wrapper.c:41:8 #2 0x9ab248 in do_xmallocz wrapper.c:75:8 #3 0x9ab22a in xmallocz wrapper.c:83:9 #4 0x9ab2d7 in xmemdupz wrapper.c:99:16 #5 0x78d6a4 in common_prefix dir.c:191:15 #6 0x5aca48 in cmd_ls_files builtin/ls-files.c:669:16 #7 0x4cd92d in run_builtin git.c:453:11 #8 0x4cb5fa in handle_builtin git.c:704:3 #9 0x4ccf57 in run_argv git.c:771:4 #10 0x4caf49 in cmd_main git.c:902:19 #11 0x69ce2e in main common-main.c:52:11 #12 0x7f64d4d94349 in __libc_start_main (/lib64/libc.so.6+0x24349) Signed-off-by: Andrzej Hunt <ajrhunt@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-28 09:25:44 +09:00
Andrzej Hunt	5493ce7af9	wt-status: fix multiple small leaks rev.prune_data is populated (in multiple functions) via copy_pathspec, and therefore needs to be cleared after running the diff in those functions. rev(_info).pending is populated indirectly via setup_revisions, and also needs to be cleared once diffing is done. These leaks were found while running t0008 or t0021. The rev.prune_data leaks are small (80B) but noisy, hence I won't bother including their logs - the rev.pending leaks are bigger, and can happen early in the course of other commands, and therefore possibly more valuable to fix - see example log from a rebase below: Direct leak of 2048 byte(s) in 1 object(s) allocated from: #0 0x49ab79 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3 #1 0x9ac2a6 in xrealloc wrapper.c:126:8 #2 0x83da03 in add_object_array_with_path object.c:337:3 #3 0x8f5d8a in add_pending_object_with_path revision.c:329:2 #4 0x8ea50b in add_pending_object_with_mode revision.c:336:2 #5 0x8ea4fd in add_pending_object revision.c:342:2 #6 0x8ea610 in add_head_to_pending revision.c:354:2 #7 0x9b55f5 in has_uncommitted_changes wt-status.c:2474:2 #8 0x9b58c4 in require_clean_work_tree wt-status.c:2553:6 #9 0x606bcc in cmd_rebase builtin/rebase.c:1970:6 #10 0x4cd91d in run_builtin git.c:467:11 #11 0x4cb5f3 in handle_builtin git.c:719:3 #12 0x4ccf47 in run_argv git.c:808:4 #13 0x4caf49 in cmd_main git.c:939:19 #14 0x69dc0e in main common-main.c:52:11 #15 0x7f2d18909349 in __libc_start_main (/lib64/libc.so.6+0x24349) Indirect leak of 5 byte(s) in 1 object(s) allocated from: #0 0x486834 in strdup ../projects/compiler-rt/lib/asan/asan_interceptors.cpp:452:3 #1 0x9ac048 in xstrdup wrapper.c:29:14 #2 0x83da8d in add_object_array_with_path object.c:349:17 #3 0x8f5d8a in add_pending_object_with_path revision.c:329:2 #4 0x8ea50b in add_pending_object_with_mode revision.c:336:2 #5 0x8ea4fd in add_pending_object revision.c:342:2 #6 0x8ea610 in add_head_to_pending revision.c:354:2 #7 0x9b55f5 in has_uncommitted_changes wt-status.c:2474:2 #8 0x9b58c4 in require_clean_work_tree wt-status.c:2553:6 #9 0x606bcc in cmd_rebase builtin/rebase.c:1970:6 #10 0x4cd91d in run_builtin git.c:467:11 #11 0x4cb5f3 in handle_builtin git.c:719:3 #12 0x4ccf47 in run_argv git.c:808:4 #13 0x4caf49 in cmd_main git.c:939:19 #14 0x69dc0e in main common-main.c:52:11 #15 0x7f2d18909349 in __libc_start_main (/lib64/libc.so.6+0x24349) SUMMARY: AddressSanitizer: 2053 byte(s) leaked in 2 allocation(s). Signed-off-by: Andrzej Hunt <ajrhunt@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-28 09:25:44 +09:00
Andrzej Hunt	db69bf608d	revision: free remainder of old commit list in limit_list limit_list() iterates over the original revs->commits list, and consumes many of its entries via pop_commit. However we might stop iterating over the list early (e.g. if we realise that the rest of the list is uninteresting). If we do stop iterating early, list will be pointing to the unconsumed portion of revs->commits - and we need to free this list to avoid a leak. (revs->commits itself will be an invalid pointer: it will have been free'd during the first pop_commit.) However the list pointer is later reused to iterate over our new list, but only for the limiting_can_increase_treesame() branch. We therefore need to introduce a new variable for that branch - and while we're here we can rename the original list to original_list as that makes its purpose more obvious. This leak was found while running t0090. It's not likely to be very impactful, but it can happen quite early during some checkout invocations, and hence seems to be worth fixing: Direct leak of 16 byte(s) in 1 object(s) allocated from: #0 0x49a85d in malloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3 #1 0x9ac084 in do_xmalloc wrapper.c:41:8 #2 0x9ac05a in xmalloc wrapper.c:62:9 #3 0x7175d6 in commit_list_insert commit.c:540:33 #4 0x71800f in commit_list_insert_by_date commit.c:604:9 #5 0x8f8d2e in process_parents revision.c:1128:5 #6 0x8f2f2c in limit_list revision.c:1418:7 #7 0x8f210e in prepare_revision_walk revision.c:3577:7 #8 0x514170 in orphaned_commit_warning builtin/checkout.c:1185:6 #9 0x512f05 in switch_branches builtin/checkout.c:1250:3 #10 0x50f8de in checkout_branch builtin/checkout.c:1646:9 #11 0x50ba12 in checkout_main builtin/checkout.c:2003:9 #12 0x5086c0 in cmd_checkout builtin/checkout.c:2055:8 #13 0x4cd91d in run_builtin git.c:467:11 #14 0x4cb5f3 in handle_builtin git.c:719:3 #15 0x4ccf47 in run_argv git.c:808:4 #16 0x4caf49 in cmd_main git.c:939:19 #17 0x69dc0e in main common-main.c:52:11 #18 0x7faaabd0e349 in __libc_start_main (/lib64/libc.so.6+0x24349) Indirect leak of 48 byte(s) in 3 object(s) allocated from: #0 0x49a85d in malloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3 #1 0x9ac084 in do_xmalloc wrapper.c:41:8 #2 0x9ac05a in xmalloc wrapper.c:62:9 #3 0x717de6 in commit_list_append commit.c:1609:35 #4 0x8f1f9b in prepare_revision_walk revision.c:3554:12 #5 0x514170 in orphaned_commit_warning builtin/checkout.c:1185:6 #6 0x512f05 in switch_branches builtin/checkout.c:1250:3 #7 0x50f8de in checkout_branch builtin/checkout.c:1646:9 #8 0x50ba12 in checkout_main builtin/checkout.c:2003:9 #9 0x5086c0 in cmd_checkout builtin/checkout.c:2055:8 #10 0x4cd91d in run_builtin git.c:467:11 #11 0x4cb5f3 in handle_builtin git.c:719:3 #12 0x4ccf47 in run_argv git.c:808:4 #13 0x4caf49 in cmd_main git.c:939:19 #14 0x69dc0e in main common-main.c:52:11 #15 0x7faaabd0e349 in __libc_start_main (/lib64/libc.so.6+0x24349) Signed-off-by: Andrzej Hunt <ajrhunt@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-28 09:25:44 +09:00
brian m. carlson	3dd71461e2	hex: print objects using the hash algorithm member Now that all code paths correctly set the hash algorithm member of struct object_id, write an object's hex representation using the hash algorithm member embedded in it. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-27 16:31:39 +09:00
brian m. carlson	b8505ecbf2	hex: default to the_hash_algo on zero algorithm value There are numerous places in the codebase where we assume we can initialize data by zeroing all its bytes. However, when we do that with a struct object_id, it leaves the structure with a zero value for the algorithm, which is invalid. We could forbid this pattern and require that all struct object_id instances be initialized using oidclr, but this seems burdensome and it's unnatural to most C programmers. Instead, if the algorithm is zero, assume we wanted to use the default hash algorithm instead. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-27 16:31:39 +09:00
brian m. carlson	71b7672b67	builtin/pack-objects: avoid using struct object_id for pack hash We use struct object_id for the names of objects. It isn't intended to be used for other hash values that don't name objects such as the pack hash. Because struct object_id will soon need to have its algorithm member set, using it in this code path would mean that we didn't set that member, only the hash member, which would result in a crash. For both of these reasons, switch to using an unsigned char array of size GIT_MAX_RAWSZ. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-27 16:31:39 +09:00
brian m. carlson	72871b132c	commit-graph: don't store file hashes as struct object_id The idea behind struct object_id is that it is supposed to represent the identifier of a standard Git object or a special pseudo-object like the all-zeros object ID. In this case, we have file hashes, which, while similar, are distinct from the identifiers of objects. Switch these code paths to use an unsigned char array. This is both more logically consistent and it means that we need not set the algorithm identifier for the struct object_id. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-27 16:31:39 +09:00
brian m. carlson	dd15f4f457	builtin/show-index: set the algorithm for object IDs In most cases, when we load the hash of an object into a struct object_id, we load it using one of the oid* or *_oid_hex functions. However, for git show-index, we read it in directly using fread. As a consequence, set the algorithm correctly so the objects can be used correctly both now and in the future. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-27 16:31:39 +09:00
brian m. carlson	14228447c9	hash: provide per-algorithm null OIDs Up until recently, object IDs did not have an algorithm member, only a hash. Consequently, it was possible to share one null (all-zeros) object ID among all hash algorithms. Now that we're going to be handling objects from multiple hash algorithms, it's important to make sure that all object IDs have a correct algorithm field. Introduce a per-algorithm null OID, and add it to struct hash_algo. Introduce a wrapper function as well, and use it everywhere we used to use the null_oid constant. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-27 16:31:39 +09:00
brian m. carlson	5a6dce70d7	hash: set, copy, and use algo field in struct object_id Now that struct object_id has an algorithm field, we should populate it. This will allow us to handle object IDs in any supported algorithm and distinguish between them. Ensure that the field is written whenever we write an object ID by storing it explicitly every time we write an object. Set values for the empty blob and tree values as well. In addition, use the algorithm field to compare object IDs. Note that because we zero-initialize struct object_id in many places throughout the codebase, we default to the default algorithm in cases where the algorithm field is zero rather than explicitly initialize all of those locations. This leads to a branch on every comparison, but the alternative is to compare the entire buffer each time and padding the buffer for SHA-1. That alternative ranges up to 3.9% worse than this approach on the perf t0001, t1450, and t1451. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-27 16:31:38 +09:00
brian m. carlson	0e5e2284f1	builtin/pack-redundant: avoid casting buffers to struct object_id Now that we need our instances of struct object_id to be zero padded, we can no longer cast unsigned char buffers to be pointers to struct object_id. This file reads data out of the pack objects and then inserts it directly into a linked list item which is a pointer to struct object_id. Instead, let's have the linked list item hold its own struct object_id and copy the data into it. In addition, since these are not really pointers to struct object_id, stop passing them around as such, and call them what they really are: pointers to unsigned char. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-27 16:31:38 +09:00
brian m. carlson	5951bf467e	Use the final_oid_fn to finalize hashing of object IDs When we're hashing a value which is going to be an object ID, we want to zero-pad that value if necessary. To do so, use the final_oid_fn instead of the final_fn anytime we're going to create an object ID to ensure we perform this operation. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-27 16:31:38 +09:00
brian m. carlson	ab795f0d77	hash: add a function to finalize object IDs To avoid the penalty of having to branch in hash comparison functions, we'll want to always compare the full hash member in a struct object_id, which will require that SHA-1 object IDs be zero-padded. To do so, add a function which finalizes a hash context and writes it into an object ID that performs this padding. Move the definition of struct object_id and the constant definitions higher up so we they are available for us to use. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-27 16:31:38 +09:00
brian m. carlson	c3b4e4ee36	http-push: set algorithm when reading object ID In most places in the codebase, we use oidread to properly read an object ID into a struct object_id. However, in the HTTP code, we end up needing to parse a loose object path with a slash in it, so we can't do that. Let's instead explicitly set the algorithm in this function so we can rely on it in the future. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-27 16:31:38 +09:00
brian m. carlson	92e2cab96b	Always use oidread to read into struct object_id In the future, we'll want oidread to automatically set the hash algorithm member for an object ID we read into it, so ensure we use oidread instead of hashcpy everywhere we're copying a hash value into a struct object_id. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-27 16:31:38 +09:00
brian m. carlson	cf0983213c	hash: add an algo member to struct object_id Now that we're working with multiple hash algorithms in the same repo, it's best if we label each object ID with its algorithm so we can determine how to format a given object ID. Add a member called algo to struct object_id. Performance testing on object ID-heavy workloads doesn't reveal a clear change in performance. Out of performance tests t0001 and t1450, there are slight variations in performance both up and down, but all measurements are within the margin of error. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-27 16:31:38 +09:00
ZheNing Hu	b722d4560e	pretty: provide human date format Add the placeholders %ah and %ch to format author date and committer date, like --date=human does, which provides more humanity date output. Signed-off-by: ZheNing Hu <adlternative@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-27 16:09:32 +09:00
Ævar Arnfjörð Bjarmason	3593ebd3f5	pretty tests: give --date/format tests a better description Change the description for the --date/format equivalency tests added in `466fb6742d` (pretty: provide a strict ISO 8601 date format, 2014-08-29) and `0df621172d` (pretty: provide short date format, 2019-11-19) to be more meaningful. This allows us to reword the comment added in the former commit to refer to both tests, and any other future test, such as the in-flight --date=human format being proposed in [1]. 1. http://lore.kernel.org/git/pull.939.v2.git.1619275340051.gitgitgadget@gmail.com Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-27 16:08:54 +09:00
Ævar Arnfjörð Bjarmason	fbfcaec8d8	pretty tests: simplify %aI/%cI date format test Change a needlessly complex test for the %aI/%cI date formats (iso-strict) added in `466fb6742d` (pretty: provide a strict ISO 8601 date format, 2014-08-29) to instead use the same pattern used to test %as/%cs since `0df621172d` (pretty: provide short date format, 2019-11-19). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-27 16:05:56 +09:00
Han-Wen Nienhuys	34c319970d	refs/debug: trace into reflog expiry too Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-27 15:59:39 +09:00
Denton Liu	7cdb096903	git-completion.bash: consolidate cases in _git_stash() The $subcommand case statement in _git_stash() is quite repetitive. Consolidate the cases together into one catch-all case to reduce the repetition. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-27 15:41:07 +09:00
Denton Liu	59d85a2a05	git-completion.bash: use $__git_cmd_idx in more places With the introduction of the $__git_cmd_idx variable in `e94fb44042` (git-completion.bash: pass $__git_subcommand_idx from __git_main(), 2021-03-24), completion functions were able to know the index at which the git command is listed, allowing them to skip options that are given to the underlying git itself, not the corresponding command (e.g. `-C asdf` in `git -C asdf branch`). While most of the changes here are self-explanatory, some bear further explanation. For the __git_find_on_cmdline() and __git_find_last_on_cmdline() pair of functions, these functions are only ever called in the context of a git command completion function. These functions will only care about words after the command so we can safely ignore the words before this. For _git_worktree(), this change is technically a no-op (once the __git_find_last_on_cmdline change is also applied). It was in poor style to have hard-coded on the index right after `worktree`. In case `git worktree` were to ever learn to accept options, the current situation would be inflexible. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-27 15:41:07 +09:00
Denton Liu	87e629756f	git-completion.bash: rename to $__git_cmd_idx In `e94fb44042` (git-completion.bash: pass $__git_subcommand_idx from __git_main(), 2021-03-24), the $__git_subcommand_idx variable was introduced. Naming it after the index of the subcommand is needlessly confusing as, when this variable is used, it is in the completion functions for commands (e.g. _git_remote()) where for `git remote add`, the `remote` is referred to as the command and `add` is referred to as the subcommand. Rename this variable so that it's obvious it's about git commands. While we're at it, shorten up its name so that it's still readable without being a handful to type. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-27 15:41:07 +09:00
Patrick Steinhardt	482d549906	t1300: fix unset of GIT_CONFIG_NOSYSTEM leaking into subsequent tests In order to test whether the new GIT_CONFIG_SYSTEM environment variable behaves as expected, we unset GIT_CONFIG_NOSYSTEM in one of our tests in t1300. But because tests are not executed in a subshell, this unset leaks into all subsequent tests and may thus cause them to fail in some environments. These failures are easily reproducable with `make prefix=/root test`. Fix the issue by not using `sane_unset GIT_CONFIG_NOSYSTEM`, but instead just manually add it to the environment of the two command invocations which need it. Reported-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-27 15:15:34 +09:00
Bruno Albuquerque	a2ba162cda	object-info: support for retrieving object info Sometimes it is useful to get information of an object without having to download it completely. Add the "object-info" capability that lets the client ask for object-related information with their full hexadecimal object names. Only sizes are returned for now. Signed-off-by: Bruno Albuquerque <bga@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-20 17:41:13 -07:00
Junio C Hamano	311531c9de	The twelfth batch Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-20 17:23:37 -07:00
Junio C Hamano	4090b6973b	Merge branch 'js/access-nul-emulation-on-windows' Portability fix. * js/access-nul-emulation-on-windows: msvc: avoid calling `access("NUL", flags)`	2021-04-20 17:23:37 -07:00
Junio C Hamano	b9fa3ba0ca	Merge branch 'sg/bugreport-fixes' The dependencies for config-list.h and command-list.h were broken when the former was split out of the latter, which has been corrected. * sg/bugreport-fixes: Makefile: add missing dependencies of 'config-list.h'	2021-04-20 17:23:37 -07:00
Junio C Hamano	092bf77e8c	Merge branch 'jc/doc-do-not-capitalize-clarification' Doc update for developers. * jc/doc-do-not-capitalize-clarification: doc: clarify "do not capitalize the first word" rule	2021-04-20 17:23:36 -07:00
Junio C Hamano	fdef940afe	Merge branch 'ab/usage-error-docs' Documentation updates, with unrelated comment updates, too. * ab/usage-error-docs: api docs: document that BUG() emits a trace2 error event api docs: document BUG() in api-error-handling.txt usage.c: don't copy/paste the same comment three times	2021-04-20 17:23:36 -07:00
Junio C Hamano	522010b573	Merge branch 'ab/detox-gettext-tests' Test clean-up. * ab/detox-gettext-tests: tests: remove all uses of test_i18cmp	2021-04-20 17:23:36 -07:00
Junio C Hamano	e02f75c9eb	Merge branch 'jt/fetch-pack-request-fix' * jt/fetch-pack-request-fix: fetch-pack: buffer object-format with other args	2021-04-20 17:23:36 -07:00
Junio C Hamano	196cc525e2	Merge branch 'hn/reftable-tables-doc-update' Doc updte. * hn/reftable-tables-doc-update: reftable: document an alternate cleanup method on Windows	2021-04-20 17:23:35 -07:00
Junio C Hamano	2eebac2c49	Merge branch 'jk/pack-objects-bitmap-progress-fix' When "git pack-objects" makes a literal copy of a part of existing packfile using the reachability bitmaps, its update to the progress meter was broken. * jk/pack-objects-bitmap-progress-fix: pack-objects: update "nr_seen" progress based on pack-reused count	2021-04-20 17:23:35 -07:00
Junio C Hamano	ab99efc817	Merge branch 'ab/userdiff-tests' A bit of code clean-up and a lot of test clean-up around userdiff area. * ab/userdiff-tests: blame tests: simplify userdiff driver test blame tests: don't rely on t/t4018/ directory userdiff: remove support for "broken" tests userdiff tests: list builtin drivers via test-tool userdiff tests: explicitly test "default" pattern userdiff: add and use for_each_userdiff_driver() userdiff style: normalize pascal regex declaration userdiff style: declare patterns with consistent style userdiff style: re-order drivers in alphabetical order	2021-04-20 17:23:34 -07:00
Junio C Hamano	6d7a62d74d	Merge branch 'ar/userdiff-scheme' Userdiff patterns for "Scheme" has been added. * ar/userdiff-scheme: userdiff: add support for Scheme	2021-04-20 17:23:34 -07:00
Denton Liu	8c8c8c0e16	git-completion.bash: separate some commands onto their own line In `e94fb44042` (git-completion.bash: pass $__git_subcommand_idx from __git_main(), 2021-03-24), a line was introduced which contained multiple statements. This is difficult to read so break it into multiple lines. While we're at it, follow this convention for the rest of the __git_main() and break up lines that contain multiple statements. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-20 13:27:35 -07:00
Andrey Bienkowski	9364bf465d	doc: clarify the filename encoding in git diff AFAICT parsing the output of `git diff --name-only master...feature` is the intended way of programmatically getting the list of files modified by a feature branch. It is impossible to parse text unless you know what encoding it is in. The output encoding of diff --name-only and Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-20 12:57:26 -07:00
ZheNing Hu	844c3f0b0b	ref-filter: reuse output buffer When we use `git for-each-ref`, every ref will allocate its own output strbuf and error strbuf. But we can reuse the final strbuf for each step ref's output. The error buffer will also be reused, despite the fact that the git will exit when `format_ref_array_item()` return a non-zero value and output the contents of the error buffer. The performance for `git for-each-ref` on the Git repository itself with performance testing tool `hyperfine` changes from 23.7 ms ± 0.9 ms to 22.2 ms ± 1.0 ms. Optimization is relatively minor. At the same time, we apply this optimization to `git tag -l` and `git branch -l`. This approach is similar to the one used by `79ed0a5` (cat-file: use a single strbuf for all output, 2018-08-14) to speed up the cat-file builtin. Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Jeff King <peff@peff.net> Helped-by: René Scharfe <l.s.r@web.de> Signed-off-by: ZheNing Hu <adlternative@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-20 11:09:50 -07:00
ZheNing Hu	22f69a85ed	ref-filter: get rid of show_ref_array_item Inlining the exported function `show_ref_array_item()`, which is not providing the right level of abstraction, simplifies the API and can unlock improvements at the former call sites. Helped-by: René Scharfe <l.s.r@web.de> Signed-off-by: ZheNing Hu <adlternative@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-19 15:08:00 -07:00
Matheus Tavares	68e66f2987	parallel-checkout: add design documentation Co-authored-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-19 15:05:25 -07:00
Patrick Steinhardt	4179b4897f	config: allow overriding of global and system configuration In order to have git run in a fully controlled environment without any misconfiguration, it may be desirable for users or scripts to override global- and system-level configuration files. We already have a way of doing this, which is to unset both HOME and XDG_CONFIG_HOME environment variables and to set `GIT_CONFIG_NOGLOBAL=true`. This is quite kludgy, and unsetting the first two variables likely has an impact on other executables spawned by such a script. The obvious way to fix this would be to introduce `GIT_CONFIG_NOGLOBAL` as an equivalent to `GIT_CONFIG_NOSYSTEM`. But in the past, it has turned out that this design is inflexible: we cannot test system-level parsing of the git configuration in our test harness because there is no way to change its location, so all tests run with `GIT_CONFIG_NOSYSTEM` set. Instead of doing the same mistake with `GIT_CONFIG_NOGLOBAL`, introduce two new variables `GIT_CONFIG_GLOBAL` and `GIT_CONFIG_SYSTEM`: - If unset, git continues to use the usual locations. - If set to a specific path, we skip reading the normal configuration files and instead take the path. By setting the path to `/dev/null`, no configuration will be loaded for the respective level. This implements the usecase where we want to execute code in a sanitized environment without any potential misconfigurations via `/dev/null`, but is more flexible and allows for more usecases than simply adding `GIT_CONFIG_NOGLOBAL`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-19 14:16:59 -07:00
Patrick Steinhardt	1e06eb9b5d	config: unify code paths to get global config paths There's two callsites which assemble global config paths, once in the config loading code and once in the git-config(1) builtin. We're about to implement a way to override global config paths via an environment variable which would require us to adjust both sites. Unify both code paths into a single `git_global_config()` function which returns both paths for `~/.gitconfig` and the XDG config file. This will make the subsequent patch which introduces the new envvar easier to implement. No functional changes are expected from this patch. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-19 14:16:59 -07:00
Patrick Steinhardt	c62a999c6e	config: rename `git_etc_config()` The `git_etc_gitconfig()` function retrieves the system-level path of the configuration file. We're about to introduce a way to override it via an environment variable, at which point the name of this function would start to become misleading. Rename the function to `git_system_config()` as a preparatory step. While at it, the function is also refactored to pass memory ownership to the caller. This is done to better match semantics of `git_global_config()`, which is going to be introduced in the next commit. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-19 14:16:59 -07:00
Patrick Steinhardt	9cf68b27d5	rev-list: allow filtering of provided items When providing an object filter, it is currently impossible to also filter provided items. E.g. when executing `git rev-list HEAD` , the commit this reference points to will be treated as user-provided and is thus excluded from the filtering mechanism. This makes it harder than necessary to properly use the new `--filter=object:type` filter given that even if the user wants to only see blobs, he'll still see commits of provided references. Improve this by introducing a new `--filter-provided-objects` option to the git-rev-parse(1) command. If given, then all user-provided references will be subject to filtering. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-19 14:09:11 -07:00
Patrick Steinhardt	169a15ebd6	pack-bitmap: implement combined filter When the user has multiple objects filters specified, then this is internally represented by having a "combined" filter. These combined filters aren't yet supported by bitmap indices and can thus not be accelerated. Fix this by implementing support for these combined filters. The implementation is quite trivial: when there's a combined filter, we simply recurse into `filter_bitmap()` for all of the sub-filters. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-19 14:09:11 -07:00
Patrick Steinhardt	7ab6aafa58	pack-bitmap: implement object type filter The preceding commit has added a new object filter for git-rev-list(1) which allows to filter objects by type. Implement the equivalent filter for packfile bitmaps so that we can answer these queries fast. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-19 14:09:11 -07:00
Patrick Steinhardt	b0c42a53c9	list-objects: implement object type filter While it already is possible to filter objects by some criteria in git-rev-list(1), it is not yet possible to filter out only a specific type of objects. This makes some filters less useful. The `blob:limit` filter for example filters blobs such that only those which are smaller than the given limit are returned. But it is unfit to ask only for these smallish blobs, given that git-rev-list(1) will continue to print tags, commits and trees. Now that we have the infrastructure in place to also filter tags and commits, we can improve this situation by implementing a new filter which selects objects based on their type. Above query can thus trivially be implemented with the following command: $ git rev-list --objects --filter=object:type=blob \ --filter=blob:limit=200 Furthermore, this filter allows to optimize for certain other cases: if for example only tags or commits have been selected, there is no need to walk down trees. The new filter is not yet supported in bitmaps. This is going to be implemented in a subsequent commit. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-19 14:09:11 -07:00
Matheus Tavares	1c4d6f46be	parallel-checkout: support progress displaying Original-patch-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-19 11:57:05 -07:00
Matheus Tavares	7531e4b66e	parallel-checkout: add configuration options Make parallel checkout configurable by introducing two new settings: checkout.workers and checkout.thresholdForParallelism. The first defines the number of workers (where one means sequential checkout), and the second defines the minimum number of entries to attempt parallel checkout. To decide the default value for checkout.workers, the parallel version was benchmarked during three operations in the linux repo, with cold cache: cloning v5.8, checking out v5.8 from v2.6.15 (checkout I) and checking out v5.8 from v5.7 (checkout II). The four tables below show the mean run times and standard deviations for 5 runs in: a local file system on SSD, a local file system on HDD, a Linux NFS server, and Amazon EFS (all on Linux). Each parallel checkout test was executed with the number of workers that brings the best overall results in that environment. Local SSD: Sequential 10 workers Speedup Clone 8.805 s ± 0.043 s 3.564 s ± 0.041 s 2.47 ± 0.03 Checkout I 9.678 s ± 0.057 s 4.486 s ± 0.050 s 2.16 ± 0.03 Checkout II 5.034 s ± 0.072 s 3.021 s ± 0.038 s 1.67 ± 0.03 Local HDD: Sequential 10 workers Speedup Clone 32.288 s ± 0.580 s 30.724 s ± 0.522 s 1.05 ± 0.03 Checkout I 54.172 s ± 7.119 s 54.429 s ± 6.738 s 1.00 ± 0.18 Checkout II 40.465 s ± 2.402 s 38.682 s ± 1.365 s 1.05 ± 0.07 Linux NFS server (v4.1, on EBS, single availability zone): Sequential 32 workers Speedup Clone 240.368 s ± 6.347 s 57.349 s ± 0.870 s 4.19 ± 0.13 Checkout I 242.862 s ± 2.215 s 58.700 s ± 0.904 s 4.14 ± 0.07 Checkout II 65.751 s ± 1.577 s 23.820 s ± 0.407 s 2.76 ± 0.08 EFS (v4.1, replicated over multiple availability zones): Sequential 32 workers Speedup Clone 922.321 s ± 2.274 s 210.453 s ± 3.412 s 4.38 ± 0.07 Checkout I 1011.300 s ± 7.346 s 297.828 s ± 0.964 s 3.40 ± 0.03 Checkout II 294.104 s ± 1.836 s 126.017 s ± 1.190 s 2.33 ± 0.03 The above benchmarks show that parallel checkout is most effective on repositories located on an SSD or over a distributed file system. For local file systems on spinning disks, and/or older machines, the parallelism does not always bring a good performance. For this reason, the default value for checkout.workers is one, a.k.a. sequential checkout. To decide the default value for checkout.thresholdForParallelism, another benchmark was executed in the "Local SSD" setup, where parallel checkout showed to be beneficial. This time, we compared the runtime of a `git checkout -f`, with and without parallelism, after randomly removing an increasing number of files from the Linux working tree. The "sequential fallback" column below corresponds to the executions where checkout.workers was 10 but checkout.thresholdForParallelism was equal to the number of to-be-updated files plus one (so that we end up writing sequentially). Each test case was sampled 15 times, and each sample had a randomly different set of files removed. Here are the results: sequential fallback 10 workers speedup 10 files 772.3 ms ± 12.6 ms 769.0 ms ± 13.6 ms 1.00 ± 0.02 20 files 780.5 ms ± 15.8 ms 775.2 ms ± 9.2 ms 1.01 ± 0.02 50 files 806.2 ms ± 13.8 ms 767.4 ms ± 8.5 ms 1.05 ± 0.02 100 files 833.7 ms ± 21.4 ms 750.5 ms ± 16.8 ms 1.11 ± 0.04 200 files 897.6 ms ± 30.9 ms 730.5 ms ± 14.7 ms 1.23 ± 0.05 500 files 1035.4 ms ± 48.0 ms 677.1 ms ± 22.3 ms 1.53 ± 0.09 1000 files 1244.6 ms ± 35.6 ms 654.0 ms ± 38.3 ms 1.90 ± 0.12 2000 files 1488.8 ms ± 53.4 ms 658.8 ms ± 23.8 ms 2.26 ± 0.12 From the above numbers, 100 files seems to be a reasonable default value for the threshold setting. Note: Up to 1000 files, we observe a drop in the execution time of the parallel code with an increase in the number of files. This is a rather odd behavior, but it was observed in multiple repetitions. Above 1000 files, the execution time increases according to the number of files, as one would expect. About the test environments: Local SSD tests were executed on an i7-7700HQ (4 cores with hyper-threading) running Manjaro Linux. Local HDD tests were executed on an Intel(R) Xeon(R) E3-1230 (also 4 cores with hyper-threading), HDD Seagate Barracuda 7200.14 SATA 3.1, running Debian. NFS and EFS tests were executed on an Amazon EC2 c5n.xlarge instance, with 4 vCPUs. The Linux NFS server was running on a m6g.large instance with 2 vCPUSs and a 1 TB EBS GP2 volume. Before each timing, the linux repository was removed (or checked out back to its previous state), and `sync && sysctl vm.drop_caches=3` was executed. Co-authored-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-19 11:57:05 -07:00
Matheus Tavares	e9e8adf1a8	parallel-checkout: make it truly parallel Use multiple worker processes to distribute the queued entries and call write_pc_item() in parallel for them. The items are distributed uniformly in contiguous chunks. This minimizes the chances of two workers writing to the same directory simultaneously, which could affect performance due to lock contention in the kernel. Work stealing (or any other format of re-distribution) is not implemented yet. The protocol between the main process and the workers is quite simple. They exchange binary messages packed in pkt-line format, and use PKT-FLUSH to mark the end of input (from both sides). The main process starts the communication by sending N pkt-lines, each corresponding to an item that needs to be written. These packets contain all the necessary information to load, smudge, and write the blob associated with each item. Then it waits for the worker to send back N pkt-lines containing the results for each item. The resulting packet must contain: the identification number of the item that it refers to, the status of the operation, and the lstat() data gathered after writing the file (iff the operation was successful). For now, checkout always uses a hardcoded value of 2 workers, only to demonstrate that the parallel checkout framework correctly divides and writes the queued entries. The next patch will add user configurations and define a more reasonable default, based on tests with the said settings. Co-authored-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Co-authored-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-19 11:57:05 -07:00
Matheus Tavares	04155bdad8	unpack-trees: add basic support for parallel checkout This new interface allows us to enqueue some of the entries being checked out to later uncompress them, apply in-process filters, and write out the files in parallel. For now, the parallel checkout machinery is enabled by default and there is no user configuration, but run_parallel_checkout() just writes the queued entries in sequence (without spawning additional workers). The next patch will actually implement the parallelism and, later, we will make it configurable. Note that, to avoid potential data races, not all entries are eligible for parallel checkout. Also, paths that collide on disk (e.g. case-sensitive paths in case-insensitive file systems), are detected by the parallel checkout code and skipped, so that they can be safely sequentially handled later. The collision detection works like the following: - If the collision was at basename (e.g. 'a/b' and 'a/B'), the framework detects it by looking for EEXIST and EISDIR errors after an open(O_CREAT \| O_EXCL) failure. - If the collision was at dirname (e.g. 'a/b' and 'A'), it is detected at the has_dirs_only_path() check, which is done for the leading path of each item in the parallel checkout queue. Both verifications rely on the fact that, before enqueueing an entry for parallel checkout, checkout_entry() makes sure that there is no file at the entry's path and that its leading components are all real directories. So, any later change in these conditions indicates that there was a collision (either between two parallel-eligible entries or between an eligible and an ineligible one). After all parallel-eligible entries have been processed, the collided (and thus, skipped) entries are sequentially fed to checkout_entry() again. This is similar to the way the current code deals with collisions, overwriting the previously checked out entries with the subsequent ones. The only difference is that, since we no longer create the files in the same order that they appear on index, we are not able to determine which of the colliding entries will survive on disk (for the classic code, it is always the last entry). Co-authored-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Co-authored-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-19 11:57:05 -07:00
Sergey Organov	364bc11fe5	doc/diff-options: document new --diff-merges features Document changes in -m and --diff-merges=m semantics, as well as new --diff-merges=on option. Signed-off-by: Sergey Organov <sorganov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-16 23:38:35 -07:00
Sergey Organov	17c13e60fd	diff-merges: introduce log.diffMerges config variable New log.diffMerges configuration variable sets the format that --diff-merges=on will be using. The default is "separate". t4013: add the following tests for log.diffMerges config: * Test that wrong values are denied. * Test that the value of log.diffMerges properly affects both --diff-merges=on and -m. t9902: fix completion tests for log.d* to match log.diffMerges. Added documentation for log.diffMerges. Signed-off-by: Sergey Organov <sorganov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-16 23:38:35 -07:00
Sergey Organov	38fc4dbbc2	diff-merges: adapt -m to enable default diff format Let -m option (and --diff-merges=m) enable the default format instead of "separate", to be able to tune it with log.diffMerges option. Signed-off-by: Sergey Organov <sorganov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-16 23:38:35 -07:00
Sergey Organov	26a0f58da8	diff-merges: refactor set_diff_merges() Split set_diff_merges() into separate parsing and execution functions, the former to be reused for parsing of configuration values later in the patch series. Signed-off-by: Sergey Organov <sorganov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-16 23:38:35 -07:00
Sergey Organov	4320815eb9	diff-merges: introduce --diff-merges=on Introduce the notion of default diff format for merges, and the option "on" to select it. The default format is "separate" and can't yet be changed, so effectively "on" is just a synonym for "separate" for now. Add corresponding test to t4013. This is in preparation for introducing log.diffMerges configuration option that will let --diff-merges=on to be configured to any supported format. Signed-off-by: Sergey Organov <sorganov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-16 23:38:35 -07:00
Junio C Hamano	b0c09ab879	The eleventh (aka "ort") batch Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-16 13:53:34 -07:00
Junio C Hamano	257ae76ba9	Merge branch 'ah/merge-ort-ubsan-fix' Code clean-up for merge-ort backend. * ah/merge-ort-ubsan-fix: merge-ort: only do pointer arithmetic for non-empty lists	2021-04-16 13:53:34 -07:00
Junio C Hamano	7bec8e7fa6	Merge branch 'en/ort-readiness' Plug the ort merge backend throughout the rest of the system, and start testing it as a replacement for the recursive backend. * en/ort-readiness: Add testing with merge-ort merge strategy t6423: mark remaining expected failure under merge-ort as such Revert "merge-ort: ignore the directory rename split conflict for now" merge-recursive: add a bunch of FIXME comments documenting known bugs merge-ort: write $GIT_DIR/AUTO_MERGE whenever we hit a conflict t: mark several submodule merging tests as fixed under merge-ort merge-ort: implement CE_SKIP_WORKTREE handling with conflicted entries t6428: new test for SKIP_WORKTREE handling and conflicts merge-ort: support subtree shifting merge-ort: let renormalization change modify/delete into clean delete merge-ort: have ll_merge() use a special attr_index for renormalization merge-ort: add a special minimal index just for renormalization merge-ort: use STABLE_QSORT instead of QSORT where required	2021-04-16 13:53:34 -07:00
Junio C Hamano	e2e1a03f6b	Merge branch 'en/ort-perf-batch-10' Various rename detection optimization to help "ort" merge strategy backend. * en/ort-perf-batch-10: diffcore-rename: determine which relevant_sources are no longer relevant merge-ort: record the reason that we want a rename for a file diffcore-rename: add computation of number of unknown renames diffcore-rename: check if we have enough renames for directories early on diffcore-rename: only compute dir_rename_count for relevant directories merge-ort: record the reason that we want a rename for a directory merge-ort, diffcore-rename: tweak dirs_removed and relevant_source type diffcore-rename: take advantage of "majority rules" to skip more renames	2021-04-16 13:53:33 -07:00
Ville Skyttä	76655e8a28	completion: avoid aliased command lookup error in nounset mode Aliased command lookup accesses the `list` variable before it has been set, causing an error in "nounset" mode. Initialize to an empty string to avoid that. $ git nonexistent-command <Tab>bash: list: unbound variable Signed-off-by: Ville Skyttä <ville.skytta@iki.fi> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-16 13:40:52 -07:00
Derrick Stolee	32f67888d8	maintenance: respect remote.*.skipFetchAll If a remote has the skipFetchAll setting enabled, then that remote is not intended for frequent fetching. It makes sense to not fetch that data during the 'prefetch' maintenance task. Skip that remote in the iteration without error. The skip_default_update member is initialized in remote.c:handle_config() as part of initializing the 'struct remote'. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-16 13:36:55 -07:00
Derrick Stolee	cfd781ea22	maintenance: use 'git fetch --prefetch' The 'prefetch' maintenance task previously forced the following refspec for each remote: +refs/heads/:refs/prefetch/<remote>/ If a user has specified a more strict refspec for the remote, then this prefetch task downloads more objects than necessary. The previous change introduced the '--prefetch' option to 'git fetch' which manipulates the remote's refspec to place all resulting refs into refs/prefetch/, with further partitioning based on the destinations of those refspecs. Update the documentation to be more generic about the destination refs. Do not mention custom refspecs explicitly, as that does not need to be highlighted in this documentation. The important part of placing refs in refs/prefetch/ remains. Reported-by: Tom Saeger <tom.saeger@oracle.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-16 13:36:55 -07:00
Derrick Stolee	2e03115d0c	fetch: add --prefetch option The --prefetch option will be used by the 'prefetch' maintenance task instead of sending refspecs explicitly across the command-line. The intention is to modify the refspec to place all results in refs/prefetch/ instead of anywhere else. Create helper method filter_prefetch_refspec() to modify a given refspec to fit the rules expected of the prefetch task: * Negative refspecs are preserved. * Refspecs without a destination are removed. * Refspecs whose source starts with "refs/tags/" are removed. * Other refspecs are placed within "refs/prefetch/". Finally, we add the 'force' option to ensure that prefetch refs are replaced as necessary. There are some interesting cases that are worth testing. An earlier version of this change dropped the "i--" from the loop that deletes a refspec item and shifts the remaining entries down. This allowed some refspecs to not be modified. The subtle part about the first --prefetch test is that the "refs/tags/" refspec appears directly before the "refs/heads/bogus/" refspec. Without that "i--", this ordering would remove the "refs/tags/" refspec and leave the last one unmodified, placing the result in "refs/heads/". It is possible to have an empty refspec. This is typically the case for remotes other than the origin, where users want to fetch a specific tag or branch. To correctly test this case, we need to further remove the upstream remote for the local branch. Thus, we are testing a refspec that will be deleted, leaving nothing to fetch. Helped-by: Tom Saeger <tom.saeger@oracle.com> Helped-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-16 13:36:55 -07:00
Johannes Schindelin	9160068ac6	msvc: avoid calling `access("NUL", flags)` Apparently this is not supported with Microsoft's Universal C Runtime. So let's not actually do that. Instead, just return success because we _know_ that we expect the `NUL` device to be present. Side note: it is possible to turn off the "Null device driver" and thereby disable `NUL`. Too many things are broken if this driver is disabled, therefore it is not worth bothering to try to detect its presence when `access()` is called. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-16 12:05:32 -07:00
Matheus Tavares	332ec963bc	pkt-line: do not report packet write errors twice On write() errors, packet_write() dies with the same error message that is already printed by its callee, packet_write_gently(). This produces an unnecessarily verbose and repetitive output: error: packet write failed fatal: packet write failed: <strerror() message> In addition to that, packet_write_gently() does not always fulfill its caller expectation that errno will be properly set before a non-zero return. In particular, that is not the case for a "data exceeds max packet size" error. So, in this case, packet_write() will call die_errno() and print an strerror(errno) message that might be totally unrelated to the actual error. Fix both those issues by turning packet_write() and packet_write_gently() into wrappers to a common lower level function that doesn't print the error message, but instead returns it on a buffer for the caller to die() or error() as appropriate. Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-15 15:05:31 -07:00
Junio C Hamano	d1b10fc6d8	The tenth batch Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-15 13:36:01 -07:00
Junio C Hamano	5a7e52bed2	Merge branch 'jz/apply-3way-cached' "git apply" now takes "--3way" and "--cached" at the same time, and work and record results only in the index. * jz/apply-3way-cached: git-apply: allow simultaneous --cached and --3way options	2021-04-15 13:36:01 -07:00
Junio C Hamano	b98db1dd70	Merge branch 'ab/complete-cherry-pick-head' The command line completion (in contrib/) has learned that CHERRY_PICK_HEAD is a possible pseudo-ref. * ab/complete-cherry-pick-head: bash completion: complete CHERRY_PICK_HEAD	2021-04-15 13:36:01 -07:00
Junio C Hamano	771c758e8a	Merge branch 'jz/apply-run-3way-first' "git apply --3way" has always been "to fall back to 3-way merge only when straight application fails". Swap the order of falling back so that 3-way is always attempted first (only when the option is given, of course) and then straight patch application is used as a fallback when it fails. * jz/apply-run-3way-first: git-apply: try threeway first when "--3way" is used	2021-04-15 13:36:00 -07:00
Øystein Walle	f3cce896a8	transport: respect verbosity when setting upstream A command such as `git push -qu origin feature` will print "Branch 'feature' set up to track remote branch 'feature' from 'origin'." even when --quiet is passed. In this case it's because install_branch_config() is always called with BRANCH_CONFIG_VERBOSE. struct transport keeps track of the desired verbosity. Fix the above issue by passing BRANCH_CONFIG_VERBOSE conditionally based on that. Signed-off-by: Øystein Walle <oystwa@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-15 12:52:49 -07:00
Junio C Hamano	151b6c2dd7	doc: clarify "do not capitalize the first word" rule The same "do not capitalize the first word" rule is applied to both our patch titles and error messages, but the existing description was fuzzy in two aspects. * For error messages, it was not said that this was only about the first word that begins the sentence. * For both, it was not clear when a capital letter there was not an error. We avoid capitalizing the first word when the only reason you would capitalize it is because it happens to be the first word in the sentence. If a proper noun, which is usually spelled in capital letters, happens to come at the beginning of the sentence, it should be kept in capital letters. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-14 23:41:00 -07:00
Derrick Stolee	4589bca829	name-hash: use expand_to_path() A sparse-index loads the name-hash data for its entries, including the sparse-directory entries. If a caller asks for a path that is contained within a sparse-directory entry, we need to expand to a full index and recalculate the name hash table before returning the result. Insert calls to expand_to_path() to protect against this case. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-14 13:48:01 -07:00
Derrick Stolee	71f82d032f	sparse-index: expand_to_path() Some users of the index API have a specific path they are looking for, but choose to use index_file_exists() to rely on the name-hash hashtable instead of doing binary search with index_name_pos(). These users only need to know a yes/no answer, not a position within the cache array. When the index is sparse, the name-hash hash table does not contain the full list of paths within sparse directories. It _does_ contain the directory names for the sparse-directory entries. Create a helper function, expand_to_path(), for intended use with the name-hash hashtable functions. The integration with name-hash.c will follow in a later change. The solution here is to use ensure_full_index() when we determine that the requested path is within a sparse directory entry. This will populate the name-hash hashtable as the index is recomputed from scratch. There may be cases where the caller is trying to find an untracked path that is not in the index but also is not within a sparse directory entry. We want to minimize the overhead for these requests. If we used index_name_pos() to find the insertion order of the path, then we could determine from that position if a sparse-directory exists. (In fact, just calling index_name_pos() in that case would lead to expanding the index to a full index.) However, this takes O(log N) time where N is the number of cache entries. To keep the performance of this call based mostly on the input string, use index_file_exists() to look for the ancestors of the path. Using the heuristic that a sparse directory is likely to have a small number of parent directories, we start from the bottom and build up. Use a string buffer to allow mutating the path name to terminate after each slash for each hashset test. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-14 13:47:54 -07:00
Derrick Stolee	5f11669586	name-hash: don't add directories to name_hash Sparse directory entries represent a directory that is outside the sparse-checkout definition. These are not paths to blobs, so should not be added to the name_hash table. Instead, they should be added to the directory hashtable when 'ignore_case' is true. Add a condition to avoid placing sparse directories into the name_hash hashtable. This avoids filling the table with extra entries that will never be queried. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-14 13:47:51 -07:00
Derrick Stolee	f5fed74fb2	revision: ensure full index Before iterating over all index entries, ensure that a sparse index is expanded to a full index to avoid unexpected behavior. This case could be integrated later by ensuring that we walk the tree in the sparse-directory entry, but the current behavior is only expecting blobs. Save this integration for later when it can be properly tested. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-14 13:47:48 -07:00
Derrick Stolee	dc26b23ebc	resolve-undo: ensure full index Before iterating over all cache entries, ensure that a sparse index is expanded to a full index to avoid unexpected behavior. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-14 13:47:45 -07:00
Derrick Stolee	0c18c059a1	read-cache: ensure full index Before iterating over all cache entries, ensure that a sparse index is expanded to a full index to avoid unexpected behavior. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-14 13:47:42 -07:00
Derrick Stolee	465a04abc6	pathspec: ensure full index Before iterating over all cache entries, ensure that a sparse index is expanded to a full index to avoid unexpected behavior. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-14 13:47:40 -07:00
Derrick Stolee	f7ef64be0c	merge-recursive: ensure full index Before iterating over all cache entries, ensure that a sparse index is expanded to a full index to avoid unexpected behavior. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-14 13:47:37 -07:00
Derrick Stolee	3450a304aa	entry: ensure full index Before iterating over all cache entries, ensure that a sparse index is expanded to a full index to avoid unexpected behavior. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-14 13:47:35 -07:00
Derrick Stolee	d425f65127	dir: ensure full index Before iterating over all cache entries, ensure that a sparse index is expanded to a full index to avoid unexpected behavior. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-14 13:47:32 -07:00
Derrick Stolee	2508df0272	update-index: ensure full index Before iterating over all cache entries, ensure that a sparse index is expanded to a full index to avoid unexpected behavior. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-14 13:47:29 -07:00
Derrick Stolee	a02912019a	stash: ensure full index Before iterating over all cache entries, ensure that a sparse index is expanded to a full index to avoid unexpected behavior. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-14 13:47:26 -07:00
Derrick Stolee	e43e2a17d2	rm: ensure full index Before iterating over all cache entries, ensure that a sparse index is expanded to a full index to avoid unexpected behavior. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-14 13:47:24 -07:00
Derrick Stolee	299e2c4561	merge-index: ensure full index Before iterating over all cache entries, ensure that a sparse index is expanded to a full one to avoid unexpected behavior. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-14 13:47:21 -07:00
Derrick Stolee	42f44e84eb	ls-files: ensure full index Before iterating over all cache entries, ensure that a sparse index is expanded to a full one to avoid missing files. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-14 13:47:17 -07:00
Derrick Stolee	46eb6e31ef	grep: ensure full index Before iterating over all cache entries, ensure that a sparse index is expanded to a full one so we do not miss blobs to scan. Later, this can integrate more carefully with sparse indexes with proper testing. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-14 13:47:13 -07:00
Derrick Stolee	2227ea175f	fsck: ensure full index When verifying all blobs reachable from the index, ensure that a sparse index has been expanded to a full one to avoid missing some blobs. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-14 13:47:11 -07:00
Derrick Stolee	48b3c7da6c	difftool: ensure full index Before iterating over all cache entries, ensure that a sparse index has been expanded to a full one to avoid unexpected behavior. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-14 13:47:09 -07:00
Derrick Stolee	cb8388df5b	commit: ensure full index These two loops iterate over all cache entries, so ensure that a sparse index is expanded to a full index before we do so. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-14 13:47:06 -07:00
Derrick Stolee	0f6d3ba6bd	checkout: ensure full index Before iterating over all cache entries in the checkout builtin, ensure that we have a full index to avoid any unexpected behavior. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-14 13:47:03 -07:00
Derrick Stolee	1b850d37f4	checkout-index: ensure full index Before we iterate over all cache entries, ensure that the index is not sparse. This loop in checkout_all() might be safe to iterate over a sparse index, but let's put this protection here until it can be carefully tested. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-14 13:46:59 -07:00
Derrick Stolee	54beed24d2	add: ensure full index Before iterating over all cache entries, ensure that a sparse index is expanded to a full index to avoid unexpected behavior. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-14 13:46:48 -07:00
Derrick Stolee	118a2e8bde	cache: move ensure_full_index() to cache.h Soon we will insert ensure_full_index() calls across the codebase. Instead of also adding include statements for sparse-index.h, let's just use the fact that anything that cares about the index already has cache.h in its includes. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-14 13:46:41 -07:00
Derrick Stolee	95e0321c4d	read-cache: expand on query into sparse-directory entry Callers to index_name_pos() or index_name_stage_pos() have a specific path in mind. If that happens to be a path with an ancestor being a sparse-directory entry, it can lead to unexpected results. In the case that we did not find the requested path, check to see if the position _before_ the inserted position is a sparse directory entry that matches the initial segment of the input path (including the directory separator at the end of the directory name). If so, then expand the index to be a full index and search again. This expansion will only happen once per index read. Future enhancements could be more careful to expand only the necessary sparse directory entry, but then we would have a special "not fully sparse, but also not fully expanded" mode that could affect writing the index to file. Since this only occurs if a specific file is requested outside of the sparse checkout definition, this is unlikely to be a common situation. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-14 13:46:30 -07:00
Derrick Stolee	847a9e5d4f	*: remove 'const' qualifier for struct index_state Several methods specify that they take a 'struct index_state' pointer with the 'const' qualifier because they intend to only query the data, not change it. However, we will be introducing a step very low in the method stack that might modify a sparse-index to become a full index in the case that our queries venture inside a sparse-directory entry. This change only removes the 'const' qualifiers that are necessary for the following change which will actually modify the implementation of index_name_stage_pos(). Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-14 13:46:00 -07:00
Derrick Stolee	839a66349e	sparse-index: API protection strategy Edit and expand the sparse-index design document with the plan for guarding index operations with ensure_full_index(). Notably, the plan has changed to not have an expand_to_path() method in favor of checking for a sparse-directory hit inside of the index_path_pos() API. The changes that follow this one will incrementally add ensure_full_index() guards to iterations over all cache entries. Some iterations over the cache entries are not protected due to a few categories listed in the document. Since these are not being modified, here is a short list of the files and methods that will not receive these guards: Looking for non-zero stage: * builtin/add.c:chmod_pathspec() * builtin/merge.c:count_unmerged_entries() * merge-ort.c:record_conflicted_index_entries() * read-cache.c:unmerged_index() * rerere.c:check_one_conflict(), find_conflict(), rerere_remaining() * revision.c:prepare_show_merge() * sequencer.c:append_conflicts_hint() * wt-status.c:wt_status_collect_changes_initial() Looking for submodules: * builtin/submodule--helper.c:module_list_compute() * submodule.c: several methods * worktree.c:validate_no_submodules() Part of the index API: * name-hash.c: lazy init methods * preload-index.c:preload_thread(), preload_index() * read-cache.c: file format methods Checking for correct order of cache entries: * read-cache.c:check_ce_order() Ignores SKIP_WORKTREE entries or already aware: * unpack-trees.c:mark_new_skip_worktree() * wt-status.c:wt_status_check_sparse_checkout() Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-14 13:45:34 -07:00
Junio C Hamano	54a3917115	The ninth batch Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-13 15:28:53 -07:00
Junio C Hamano	e0d4a63c09	Merge branch 'vs/completion-with-set-u' The command-line completion script (in contrib/) had a couple of references that would have given a warning under the "-u" (nounset) option. * vs/completion-with-set-u: completion: audit and guard $GIT_* against unset use	2021-04-13 15:28:53 -07:00
Junio C Hamano	e6545201ad	Merge branch 'ab/detox-config-gettext' The last remnant of gettext-poison has been removed. * ab/detox-config-gettext: config.c: remove last remnant of GIT_TEST_GETTEXT_POISON	2021-04-13 15:28:53 -07:00
Junio C Hamano	a9414b86ac	Merge branch 'gk/gitweb-redacted-email' "gitweb" learned "e-mail privacy" feature to redact strings that look like e-mail addresses on various pages. * gk/gitweb-redacted-email: gitweb: add "e-mail privacy" feature to redact e-mail addresses	2021-04-13 15:28:52 -07:00
Junio C Hamano	8446b388b1	Merge branch 'cc/test-helper-bloom-usage-fix' Usage message fix for a test helper. * cc/test-helper-bloom-usage-fix: test-bloom: fix missing 'bloom' from usage string	2021-04-13 15:28:52 -07:00
Junio C Hamano	2279289e95	Merge branch 'ab/send-email-validate-errors' Clean-up codepaths that implements "git send-email --validate" option and improves the message from it. * ab/send-email-validate-errors: git-send-email: improve --validate error output git-send-email: refactor duplicate $? checks into a function git-send-email: test full --validate output	2021-04-13 15:28:51 -07:00
Junio C Hamano	4c6ac2da2c	Merge branch 'tb/precompose-prefix-simplify' Streamline the codepath to fix the UTF-8 encoding issues in the argv[] and the prefix on macOS. * tb/precompose-prefix-simplify: macOS: precompose startup_info->prefix precompose_utf8: make precompose_string_if_needed() public	2021-04-13 15:28:51 -07:00
Junio C Hamano	1d5fbd45c4	Merge branch 'fm/user-manual-use-preface' Doc update to improve git.info * fm/user-manual-use-preface: user-manual.txt: assign preface an id and a title	2021-04-13 15:28:51 -07:00
Junio C Hamano	7b55441db1	Merge branch 'ab/perl-do-not-abuse-map' Perl critique. * ab/perl-do-not-abuse-map: git-send-email: replace "map" in void context with "for"	2021-04-13 15:28:50 -07:00
Junio C Hamano	0623669fc6	Merge branch 'tb/pack-preferred-tips-to-give-bitmap' A configuration variable has been added to force tips of certain refs to be given a reachability bitmap. * tb/pack-preferred-tips-to-give-bitmap: builtin/pack-objects.c: respect 'pack.preferBitmapTips' t/helper/test-bitmap.c: initial commit pack-bitmap: add 'test_bitmap_commits()' helper	2021-04-13 15:28:50 -07:00
Junio C Hamano	f63add4aa8	Merge branch 'jk/ref-filter-segfault-fix' A NULL-dereference bug has been corrected in an error codepath in "git for-each-ref", "git branch --list" etc. * jk/ref-filter-segfault-fix: ref-filter: fix NULL check for parse object failure	2021-04-13 15:28:50 -07:00
Ævar Arnfjörð Bjarmason	f6d25d7878	api docs: document that BUG() emits a trace2 error event Correct documentation added in `e544221d97` (trace2: Documentation/technical/api-trace2.txt, 2019-02-22) to state that calling BUG() also emits an "error" event. See `ee4512ed48` (trace2: create new combined trace facility, 2019-02-22) for the initial implementation. The BUG() function did not emit an event then however, that was only changed later in `0a9dde4a04` (usage: trace2 BUG() invocations, 2021-02-05), that commit changed the code, but didn't update any of the docs. Let's also add a cross-reference from api-error-handling.txt. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-13 14:57:13 -07:00
Ævar Arnfjörð Bjarmason	4bf0c6f38f	api docs: document BUG() in api-error-handling.txt When the BUG() function was added in `d8193743e0` (usage.c: add BUG() function, 2017-05-12) these docs added in `1f23cfe0ef` (doc: document error handling functions and conventions, 2014-12-03) were not updated. Let's do that. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-13 14:56:58 -07:00
Ævar Arnfjörð Bjarmason	c00c7382dd	usage.c: don't copy/paste the same comment three times In `ee4512ed48` (trace2: create new combined trace facility, 2019-02-22) we started with two copies of this comment, `0ee10fd129` (usage: add trace2 entry upon warning(), 2020-11-23) added a third. Let's instead add an earlier comment that applies to all these mostly-the-same functions. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-13 14:56:28 -07:00
Ævar Arnfjörð Bjarmason	feeb03bce6	tests: remove all uses of test_i18cmp Finish the removal I started in `1108cea7f8` (tests: remove most uses of test_i18ncmp, 2021-02-11). At that time the function wasn't removed due to disruption with in-flight changes, remove the occurrences that have landed since then. As of writing this there are no test_i18ncmp uses between "master" and "seen", so let's also remove the function to finally put it to rest. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-13 14:41:24 -07:00
Jeff King	c1fa951d7e	revision: avoid parsing with --exclude-promisor-objects When --exclude-promisor-objects is given, before traversing any objects we iterate over all of the objects in any promisor packs, marking them as UNINTERESTING and SEEN. We turn the oid we get from iterating the pack into an object with parse_object(), but this has two problems: - it's slow; we are zlib inflating (and reconstructing from deltas) every byte of every object in the packfile - it leaves the tree buffers attached to their structs, which means our heap usage will grow to store every uncompressed tree simultaneously. This can be gigabytes. We can obviously fix the second by freeing the tree buffers after we've parsed them. But we can observe that the function doesn't look at the object contents at all! The only reason we call parse_object() is that we need a "struct object" on which to set the flags. There are two options here: - we can look up just the object type via oid_object_info(), and then call the appropriate lookup_foo() function - we can call lookup_unknown_object(), which gives us an OBJ_NONE struct (which will get auto-converted later by object_as_type() via calls to lookup_commit(), etc). The first one is closer to the current code, but we do pay the price to look up the type for each object. The latter should be more efficient in CPU, though it wastes a little bit of memory (the "unknown" object structs are a union of all object types, so some of the structs are bigger than they need to be). It also runs the risk of triggering a latent bug in code that calls lookup_object() directly but isn't ready to handle OBJ_NONE (such code would already be buggy, but we use lookup_unknown_object() infrequently enough that it might be hiding). I went with the second option here. I don't think the risk is high (and we'd want to find and fix any such bugs anyway), and it should be more efficient overall. The new tests in p5600 show off the improvement (this is on git.git): Test HEAD^ HEAD ------------------------------------------------------------------------------- 5600.5: count commits 0.37(0.37+0.00) 0.38(0.38+0.00) +2.7% 5600.6: count non-promisor commits 11.74(11.37+0.37) 0.04(0.03+0.00) -99.7% The improvement is particularly big in this script because _every_ object in the newly-cloned partial repo is a promisor object. So after marking them all, there's nothing left to traverse. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-13 13:22:37 -07:00
Jeff King	45a187cc34	lookup_unknown_object(): take a repository argument All of the other lookup_foo() functions take a repository argument, but lookup_unknown_object() was never converted, and it uses the_repository internally. Let's fix that. We could leave a wrapper that uses the_repository, but there aren't that many calls, so we'll just convert them all. I looked briefly at each site to see if we had a repository struct (besides the_repository) we could pass, but none of them do (so this conversion to pass the_repository is a pure noop in each case, though it does take us one step closer to eventually getting rid of the_repository). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-13 13:18:46 -07:00
Jeff King	fcc07e980b	is_promisor_object(): free tree buffer after parsing To get the list of all promisor objects, we not only include all objects in promisor packs, but also parse each of those objects to see which objects they reference. After parsing a tree object, the tree->buffer field will remain populated until we explicitly free it. So in a partial clone of blob:none, for example, we are essentially reading every tree in the repository (since they're all in the initial promisor pack), and keeping all of their uncompressed contents in memory at once. This patch frees the tree buffers after we've finished marking all of their reachable objects. We shouldn't need to do this for any other object type. While we are using some extra memory to store the structs, no other object type stores the whole contents in its parsed form (we do sometimes hold on to commit buffers, but less so these days due to commit graphs, plus most commands which care about promisor objects turn off the save_commit_buffer global). Even for a moderate-sized repository like git.git, this patch drops the peak heap (as measured by massif) for git-fsck from ~1.7GB to ~138MB. Fsck is a good candidate for measuring here because it doesn't interact with the promisor code except to call is_promisor_object(), so we can isolate just this problem. The added perf test shows only a tiny improvement on my machine for git.git, since 1.7GB isn't enough to cause any real memory pressure: Test HEAD^ HEAD -------------------------------------------------------------------------------- 5600.4: fsck 21.26(20.90+0.35) 20.84(20.79+0.04) -2.0% With linux.git the absolute change is a bit bigger, though still a small percentage: Test HEAD^ HEAD ----------------------------------------------------------------------------- 5600.4: fsck 262.26(259.13+3.12) 254.92(254.62+0.29) -2.8% I didn't have the patience to run it under massif with linux.git, but it's probably on the order of about 14GB improvement, since that's the sum of the sizes of all of the uncompressed trees (but still isn't enough to create memory pressure on this particular machine, which has 64GB of RAM). Smaller machines would probably see a bigger effect on runtime (and sadly our perf suite does not measure peak heap). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-13 13:16:39 -07:00
Han-Wen Nienhuys	2a2112a429	refs: print errno for read_raw_ref if GIT_TRACE_REFS is set The ref backend API uses errno as a sideband error channel. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-12 14:42:37 -07:00
Han-Wen Nienhuys	61a7660516	reftable: document an alternate cleanup method on Windows The new method uses the update_index counter, which isn't susceptible to clock inaccuracies. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-12 14:29:44 -07:00
Ævar Arnfjörð Bjarmason	4f4d2017a3	svn tests: refactor away a "set -e" in test body Refactor a test added in `83c9433e67` (git-svn: support for git-svn propset, 2014-12-07) to avoid using "set -e" in the test body. Let's move this into a setup test using "test_expect_success" instead. While I'm at it refactor: * Repeated "mkdir" to "mkdir -p" * Uses of "touch" to creating the files with ">" instead * The "rm -rf" at the end to happen in a "test_when_finished" Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-12 14:10:51 -07:00
Ævar Arnfjörð Bjarmason	88fce1219e	svn tests: remove legacy re-setup from init-clone test Remove the immediate "rm -rf .git" from the start of this test. This was added back in `41337e22f0` (git-svn: add tests for command-line usage of init and clone commands, 2007-11-17) when there was a "trash" directory shared by all the tests, but ever since `abc5d372ec` (Enable parallel tests, 2008-08-08) we've had per-test trash directories. So this setup can simply be removed. We could use TEST_NO_CREATE_REPO=true, but I don't think it's worth the effort to go out of our way to be different. It doesn't matter that we now have a redundant .git at the top-level. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-12 14:10:50 -07:00
Jeff King	8e118e8490	pack-objects: update "nr_seen" progress based on pack-reused count When serving a clone or fetch with bitmaps, after deciding which objects need to be sent our "pack reuse" mechanism kicks in: we try to send more-or-less verbatim a bunch of objects from the beginning of the bitmapped packfile without even adding them to the to_pack.objects array. After deciding which objects will be in the "reused" portion, we update nr_result to account for those, and then trigger display_progress() to show the user (who is undoubtedly dazzled that we managed to enumerate so many objects so quickly). But then something confusing happens: the "Enumerating objects" progress meter jumps _backwards_, counting up from zero the number of objects we actually add into to_pack.objects. This worked correctly once upon a time, but was broken in `5af050437a` (pack-objects: show some progress when counting kept objects, 2018-04-15), when the latter half of that progress meter switched to using a separate nr_seen counter, rather than nr_result. Nobody noticed for two reasons: - prior to the pack-reuse fixes from `a14aebeac3` (Merge branch 'jk/packfile-reuse-cleanup', 2020-02-14), the reuse code almost never kicked in anyway - the output looks _kind of_ correct. The "backwards" moment is hard to catch, because we overwrite the old progress number with the new one, and the larger number is displayed only for a second. So unless you look at that exact second, you just see the much smaller value, counting up to the number of non-reused objects (though of course if you catch it in stderr, or look at GIT_TRACE_PACKET from a server with bitmaps, you can see both values). This smaller output isn't wrong per se, but isn't counting what we ever intended to. We should give the user the whole number of objects we considered (which, as per 5af050437a's original purpose, is already _not_ a count of what goes into to_pack.objects). The follow-on "Counting objects" meter shows the actual number of objects we feed into that array. We can easily fix this by bumping (and showing) nr_seen for the pack-reused objects. When the included test is run without this patch, the second pack-objects invocation produces "Enumerating objects: 1" to show the one loose object, even though the resulting pack has hundreds of objects in it. With it, we jump to "Enumerating objects: 674" after deciding on reuse, and then "675" when we add in the loose object. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-12 11:31:30 -07:00
Andrzej Hunt	c1ea48a8f7	merge-ort: only do pointer arithmetic for non-empty lists versions could be an empty string_list. In that case, versions->items is NULL, and we shouldn't be trying to perform pointer arithmetic with it (as that results in undefined behaviour). Moreover we only use the results of this calculation once when calling QSORT. Therefore we choose to skip creating relevant_entries and call QSORT directly with our manipulated pointers (but only if there's data requiring sorting). This lets us avoid abusing the string_list API, and saves us from having to explain why this abuse is OK. Finally, an assertion is added to make sure that write_tree() is called with a valid offset. This issue has probably existed since: `ee4012dcf9` (merge-ort: step 2 of tree writing -- function to create tree object, 2020-12-13) But it only started occurring during tests since tests started using merge-ort: `f3b964a07e` (Add testing with merge-ort merge strategy, 2021-03-20) For reference - here's the original UBSAN commit that implemented this check, it sounds like this behaviour isn't actually likely to cause any issues (but we might as well fix it regardless): https://reviews.llvm.org/D67122 UBSAN output from t3404 or t5601: merge-ort.c:2669:43: runtime error: applying zero offset to null pointer #0 0x78bb53 in write_tree merge-ort.c:2669:43 #1 0x7856c9 in process_entries merge-ort.c:3303:2 #2 0x782317 in merge_ort_nonrecursive_internal merge-ort.c:3744:2 #3 0x77feef in merge_incore_nonrecursive merge-ort.c:3853:2 #4 0x6f6a5c in do_recursive_merge sequencer.c:640:3 #5 0x6f6a5c in do_pick_commit sequencer.c:2221:9 #6 0x6ef055 in single_pick sequencer.c:4814:9 #7 0x6ef055 in sequencer_pick_revisions sequencer.c:4867:10 #8 0x4fb392 in run_sequencer revert.c:225:9 #9 0x4fa5b0 in cmd_revert revert.c:235:8 #10 0x42abd7 in run_builtin git.c:453:11 #11 0x429531 in handle_builtin git.c:704:3 #12 0x4282fb in run_argv git.c:771:4 #13 0x4282fb in cmd_main git.c:902:19 #14 0x524b63 in main common-main.c:52:11 #15 0x7fc2ca340349 in __libc_start_main (/lib64/libc.so.6+0x24349) #16 0x4072b9 in _start start.S:120 SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior merge-ort.c:2669:43 in Signed-off-by: Andrzej Hunt <ajrhunt@google.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-12 10:38:10 -07:00
Patrick Steinhardt	9a2a4f9544	list-objects: support filtering by tag and commit Object filters currently only support filtering blobs or trees based on some criteria. This commit lays the foundation to also allow filtering of tags and commits. No change in behaviour is expected from this commit given that there are no filters yet for those object types. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-12 09:35:50 -07:00
Ævar Arnfjörð Bjarmason	414abf159f	docs: fix linting issues due to incorrect relative section order Re-order the sections of a few manual pages to be consistent with the entirety of the rest of our documentation. This allows us to remove the just-added whitelist of "bad" order from lint-man-section-order.perl. I'm doing that this way around so that code will be easy to dig up if we'll need it in the future. I've intentionally not added some other sections such as EXAMPLES to the list of known sections. If we were to add that we'd find some out of order. Perhaps we'll want to order those consistently as well in the future, at which point whitelisting some of them might become handy again. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-10 23:36:34 -07:00
Ævar Arnfjörð Bjarmason	ea8b9271b1	doc lint: lint relative section order Add a linting script to check the relative order of the sections in the documentation. We should have NAME, then SYNOPSIS, DESCRIPTION, OPTIONS etc. in that order. That holds true throughout our documentation, except for a few exceptions which are hardcoded in the linting script. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-10 23:36:34 -07:00
Ævar Arnfjörð Bjarmason	cafd9828e8	doc lint: lint and fix missing "GIT" end sections Lint for and fix the three manual pages that were missing the standard "Part of the linkgit:git[1] suite" end section. We only do this for the man[157] section documents (we don't have anything outside those sections), not files to be included, howto *.txt files etc. We could also add this to the existing (and then renamed) lint-gitlink.perl, but I'm not doing that here. Obviously all of that fits in one script, but I think for something like this that's a one-off script with global variables it's much harder to follow when a large part of your script is some if/else or keeping/resetting of state simply to work around the script doing two things instead of one. Especially because in this case this script wants to process the file as one big string, but lint-gitlink.perl wants to look at it one line at a time. We could also consolidate this whole thing and t/check-non-portable-shell.pl, but that one likes to join lines as part of its shell parsing. So let's just add another script, whole scaffolding is basically: use strict; use warnings; sub report { ... } my $code = 0; while (<>) { ... } exit $code; We'd spend more lines effort trying to consolidate them than just copying that around. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-10 23:36:34 -07:00
Ævar Arnfjörð Bjarmason	d2c9908076	doc lint: fix bugs in, simplify and improve lint script The lint-gitlink.perl script added in `ab81411ced` (ci: validate "linkgit:" in documentation, 2016-05-04) was more complex than it needed to be. It: - Was using File::Find to recursively find .txt files in Documentation/, let's instead use the Makefile as a source of truth for .txt files, and pass it down to the script. - We now don't lint linkgit:* in RelNotes/* or technical/, which we shouldn't have been doing in the first place anyway. - When the doc-diff script was added in `beb188e22a` (add a script to diff rendered documentation, 2018-08-06) we started sometimes having a "git worktree" under Documentation/. This tree contains a full checkout of git.git, as a result the "lint" script would recurse into that, and lint any .txt file found in that entire repository. In practice the only in-tree "linkgit" outside of the Documentation/ tree is contrib/contacts/git-contacts.txt and contrib/subtree/git-subtree.txt, so this wouldn't emit any errors Now we instead simply trust the Makefile to give us *.txt files. Since the Makefile also knows what sections each page should be in we don't have to open the files ourselves and try to parse that out. As a bonus this will also catch bugs with the section line in the files themselves being incorrect. The structure of the new script is mostly based on t/check-non-portable-shell.pl. As an added bonus it will also use pos() to print where the problems it finds are, e.g. given an issue like: diff --git a/Documentation/git-cherry.txt b/Documentation/git-cherry.txt [...] and line numbers. git-cherry therefore detects when commits have been -"copied" by means of linkgit:git-cherry-pick[1], linkgit:git-am[1] or -linkgit:git-rebase[1]. +"copied" by means of linkgit:git-cherry-pick[2], linkgit:git-am[3] or +linkgit:git-rebase[4]. We'll now emit: git-cherry.txt:20: error: git-cherry-pick[2]: wrong section (should be 1), shown with 'HERE' below: git-cherry.txt:20: '"copied" by means of linkgit:git-cherry-pick[2]' <-- HERE git-cherry.txt:20: error: git-am[3]: wrong section (should be 1), shown with 'HERE' below: git-cherry.txt:20: '"copied" by means of linkgit:git-cherry-pick[2], linkgit:git-am[3]' <-- HERE git-cherry.txt:21: error: git-rebase[4]: wrong section (should be 1), shown with 'HERE' below: git-cherry.txt:21: 'linkgit:git-rebase[4]' <-- HERE Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-10 23:36:34 -07:00
Ævar Arnfjörð Bjarmason	3951eeb6d9	doc lint: Perl "strict" and "warnings" in lint-gitlink.perl Amend this script added in `ab81411ced` (ci: validate "linkgit:" in documentation, 2016-05-04) to pass under "use strict", and add a "use warnings" for good measure. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-10 23:36:34 -07:00
Ævar Arnfjörð Bjarmason	19bcc73e70	Documentation/Makefile: make doc.dep dependencies a variable again Re-introduce a variable to declare what .txt files need to be considered for the purposes of scouring files to generate a dependency graph of includes. When doc.dep was introduced in `a5ae8e64cf` (Fix documentation dependency generation., 2005-11-07) we had such a variable called TEXTFILES, but it was refactored away just a few commits after that in `fb612d54c1` (Documentation: fix dependency generation., 2005-11-07). I'm planning to add more wildcards here, so let's bring it back. I'm not calling it TEXTFILES because we e.g. don't consider Documentation/technical/.txt when generating the graph (they don't use includes). Let's instead call it DOC_DEP_TXT. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-10 23:36:34 -07:00
Ævar Arnfjörð Bjarmason	824c621b76	Documentation/Makefile: make $(wildcard howto/.txt) a var Refactor occurrences of $(wildcard howto/.txt) into a single HOWTO_TXT variable for readability and consistency. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-10 23:36:34 -07:00
Ævar Arnfjörð Bjarmason	e5b32bffd1	rebase: don't override --no-reschedule-failed-exec with config Fix a bug in how --no-reschedule-failed-exec interacts with rebase.rescheduleFailedExec=true being set in the config. Before this change the --no-reschedule-failed-exec config option would be overridden by the config. This bug happened because of the particulars of how "rebase" works v.s. most other git commands when it comes to parsing options and config: When we read the config and parse the CLI options we correctly prefer the --no-reschedule-failed-exec option over rebase.rescheduleFailedExec=true in the config. So far so good. However the --reschedule-failed-exec option doesn't take effect when the rebase starts (we'd just create a ".git/rebase-merge/reschedule-failed-exec" file if it was true). It only takes effect when the exec command fails, at which point we'll reschedule the failed "exec" command. Since we only wrote out the positive ".git/rebase-merge/reschedule-failed-exec" under --reschedule-failed-exec, but nothing with --no-reschedule-failed-exec we'll forget that we asked not to reschedule failed "exec", and would happily re-read the config and see that rebase.rescheduleFailedExec=true is set. So the config will effectively override the user having explicitly disabled the option on the command-line. Even more confusingly: Since rebase accepts different options based on its state there wasn't even a way to get around this with "rebase --continue --no-reschedule-failed-exec" (but you could of course set the config with "rebase -c ..."). I think the least bad way out of this is to declare that for such options and config whatever we decide at the beginning of the rebase goes. So we'll now always create either a "reschedule-failed-exec" or a "no-reschedule-failed-exec file at the start, not just the former if we decided we wanted the feature. With this new worldview you can no longer change the setting once a rebase has started except by manually removing the state files discussed above. I think making it work like that is the the least confusing thing we can do. In the future we might want to learn to change the setting in the middle by combining "--edit-todo" with "--[no-]reschedule-failed-exec", we currently don't support combining those options, or any other way to change the state in the middle of the rebase short of manually editing the files in ".git/rebase-merge/*". The bug being fixed here originally came about because of a combination of the behavior of the code added in `d421afa0c6` (rebase: introduce --reschedule-failed-exec, 2018-12-10) and the addition of the config variable in `969de3ff0e` (rebase: add a config option to default to --reschedule-failed-exec, 2018-12-10). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-10 23:23:49 -07:00
Ævar Arnfjörð Bjarmason	cd663df710	rebase tests: camel-case rebase.rescheduleFailedExec consistently Fix a test added in `906b63942a` (rebase --am: ignore rebase.rescheduleFailedExec, 2019-07-01) to camel-case the configuration variable. This doesn't change the behavior of the test, it's merely to help its human readers. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-10 23:23:48 -07:00
Patrick Steinhardt	628d81be6c	list-objects: move tag processing into its own function Move processing of tags into its own function to make the logic easier to extend when we're going to implement filtering for tags. No change in behaviour is expected from this commit. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-10 23:03:20 -07:00
Patrick Steinhardt	b2025da38b	revision: mark commit parents as NOT_USER_GIVEN The NOT_USER_GIVEN flag of an object marks whether a flag was explicitly provided by the user or not. The most important use case for this is when filtering objects: only objects that were not explicitly requested will get filtered. The flag is currently only set for blobs and trees, which has been fine given that there are no filters for tags or commits currently. We're about to extend filtering capabilities to add object type filter though, which requires us to set up the NOT_USER_GIVEN flag correctly -- if it's not set, the object wouldn't get filtered at all. Mark unseen commit parents as NOT_USER_GIVEN when processing parents. Like this, explicitly provided parents stay user-given and thus unfiltered, while parents which get loaded as part of the graph walk can be filtered. This commit shouldn't have any user-visible impact yet as there is no logic to filter commits yet. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-10 23:03:20 -07:00
Patrick Steinhardt	a812789c26	uploadpack.txt: document implication of `uploadpackfilter.allow` When `uploadpackfilter.allow` is set to `true`, it means that filters are enabled by default except in the case where a filter is explicitly disabled via `uploadpackilter.<filter>.allow`. This option will not only enable the currently supported set of filters, but also any filters which get added in the future. As such, an admin which wants to have tight control over which filters are allowed and which aren't probably shouldn't ever set `uploadpackfilter.allow=true`. Amend the documentation to make the ramifications more explicit so that admins are aware of this. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-10 23:03:19 -07:00
Jonathan Tan	6871d0cec6	fetch-pack: refactor command and capability write A subsequent commit will need this functionality independent of the rest of send_fetch_request(), so put this into its own function. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-08 21:50:22 -07:00
Jonathan Tan	57c3451b2e	fetch-pack: refactor add_haves() A subsequent commit will need part, but not all, of the functionality in add_haves(), so move some of its functionality to its sole caller send_fetch_request(). Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-08 21:50:21 -07:00
Jonathan Tan	8102570374	fetch-pack: refactor process_acks() A subsequent commit will need part, but not all, of the functionality in process_acks(), so move some of its functionality to its sole caller do_fetch_pack_v2(). As a side effect, the resulting code is also shorter. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-08 21:50:21 -07:00
Junio C Hamano	6db01a7308	Merge branch 'jt/fetch-pack-request-fix' into jt/push-negotiation * jt/fetch-pack-request-fix: fetch-pack: buffer object-format with other args	2021-04-08 21:50:10 -07:00
Jonathan Tan	81ed96a9b2	fetch-pack: buffer object-format with other args In send_fetch_request(), "object-format" is written directly to the file descriptor, as opposed to the other arguments, which are buffered. Buffer "object-format" as well. "object-format" must be buffered; in particular, it must appear after "command=fetch" in the request. This divergence was introduced in `4b831208bb` ("fetch-pack: parse and advertise the object-format capability", 2020-05-27), perhaps as an oversight (the surrounding code at the point of this commit has already been using a request buffer.) Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-08 21:49:47 -07:00
Georgios Kontaxis	0996dd3d6d	gitweb: add "e-mail privacy" feature to redact e-mail addresses Gitweb extracts content from the Git log and makes it accessible over HTTP. As a result, e-mail addresses found in commits are exposed to web crawlers and they may not respect robots.txt. This can result in unsolicited messages. Introduce an 'email-privacy' feature which redacts e-mail addresses from the generated HTML content. Specifically, obscure addresses retrieved from the the author/committer and comment sections of the Git log. The feature is off by default. This feature does not prevent someone from downloading the unredacted commit log, e.g., by cloning the repository, and extracting information from it. It aims to hinder the low- effort, bulk collection of e-mail addresses by web crawlers. Signed-off-by: Georgios Kontaxis <geko1702+commits@99rst.org> Acked-by: Eric Wong <e@80x24.org> Acked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-08 15:54:26 -07:00
SZEDER Gábor	56550ea718	Makefile: add missing dependencies of 'config-list.h' We auto-generate the list of supported configuration variables from 'Documentation/config/.txt', and that list used to be created by the 'generate-cmdlist.sh' helper script and stored in the 'command-list.h' header. Commit `709df95b78` (help: move list_config_help to builtin/help, 2020-04-16) extracted this into a dedicated 'generate-configlist.sh' script and 'config-list.h' header, and added a new target in the 'Makefile' as well, but while doing so it forgot to extract the dependencies of the latter. Consequently, since then 'config-list.h' is not re-generated when 'Documentation/config/.txt' is updated, while 'command-list.h' is re-generated unnecessarily: $ touch Documentation/config/log.txt $ make -j4 GEN command-list.h CC help.o AR libgit.a Fix this and list all config-related documentation files as dependencies of 'config-list.h' and remove them from the dependencies of 'command-list.h'. $ touch Documentation/config/log.txt $ make GEN config-list.h CC builtin/help.o LINK git Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-08 15:04:58 -07:00
Matheus Tavares	d5f4b8260f	rm: honor sparse checkout patterns `git add` refrains from adding or updating index entries that are outside the current sparse checkout, but `git rm` doesn't follow the same restriction. This is somewhat counter-intuitive and inconsistent. So make `rm` honor the sparsity rules and advise on how to remove SKIP_WORKTREE entries just like `add` does. Also add some tests for the new behavior. Suggested-by: Elijah Newren <newren@gmail.com> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-08 14:18:03 -07:00
Matheus Tavares	a20f70478f	add: warn when asked to update SKIP_WORKTREE entries `git add` already refrains from updating SKIP_WORKTREE entries, but it silently exits with zero code when it is asked to do so. Instead, let's warn the user and display a hint on how to update these entries. Note that we only warn the user whey they give a pathspec item that matches no eligible path for updating, but it does match one or more SKIP_WORKTREE entries. A warning was chosen over erroring out right away to reproduce the same behavior `add` already exhibits with ignored files. This also allow users to continue their workflow without having to invoke `add` again with only the eligible paths (as those will have already been added). Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-08 14:18:03 -07:00
Matheus Tavares	b243012cb3	refresh_index(): add flag to ignore SKIP_WORKTREE entries refresh_index() doesn't update SKIP_WORKTREE entries, but it still matches them against the given pathspecs, marks the matches on the seen[] array, check if unmerged, etc. In the following patch, one caller will need refresh_index() to ignore SKIP_WORKTREE entries entirely, so add a flag that implements this behavior. While we are here, also realign the REFRESH_* flags and convert the hex values to the more natural bit shift format, which makes it easier to spot holes. Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-08 14:18:03 -07:00
Matheus Tavares	719630eb48	pathspec: allow to ignore SKIP_WORKTREE entries on index matching Add a new enum parameter to `add_pathspec_matches_against_index()` and `find_pathspecs_matching_against_index()`, allowing callers to specify whether these function should attempt to match SKIP_WORKTREE entries or not. This will be used in a future patch to make `git add` display a warning when it is asked to update SKIP_WORKTREE entries. Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-08 14:18:03 -07:00
Matheus Tavares	d73dbafc2c	add: make --chmod and --renormalize honor sparse checkouts Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-08 14:18:03 -07:00
Matheus Tavares	6594afc3cc	t3705: add tests for `git add` in sparse checkouts We already have a couple tests for `add` with SKIP_WORKTREE entries in t7012, but these only cover the most basic scenarios. As we will be changing how `add` deals with sparse paths in the subsequent commits, let's move these two tests to their own file and add more test cases for different `add` options and situations. This also demonstrates two options that don't currently respect SKIP_WORKTREE entries: `--chmod` and `--renormalize`. Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-08 14:18:03 -07:00
Matheus Tavares	4e95698349	add: include magic part of pathspec on --refresh error When `git add --refresh <pathspec>` doesn't find any matches for the given pathspec, it prints an error message using the `match` field of the `struct pathspec_item`. However, this field doesn't contain the magic part of the pathspec. Instead, let's use the `original` field. Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-08 14:18:03 -07:00
Atharva Raykar	a437390310	userdiff: add support for Scheme Add a diff driver for Scheme-like languages which recognizes top level and local `define` forms, whether it is a function definition, binding, syntax definition or a user-defined `define-xyzzy` form. Also supports R6RS `library` forms, `module` forms along with class and struct declarations used in Racket (PLT Scheme). Alternate "def" syntax such as those in Gerbil Scheme are also supported, like defstruct, defsyntax and so on. The rationale for picking `define` forms for the hunk headers is because it is usually the only significant form for defining the structure of the program, and it is a common pattern for schemers to have local function definitions to hide their visibility, so it is not only the top level `define`'s that are of interest. Schemers also extend the language with macros to provide their own define forms (for example, something like a `define-test-suite`) which is also captured in the hunk header. Since it is common practice to extend syntax with variants of a form like `module+`, `class*` etc, those have been supported as well. The word regex is a best-effort attempt to conform to R7RS[1] valid identifiers, symbols and numbers. [1] https://small.r7rs.org/attachment/r7rs.pdf (section 2.1) Signed-off-by: Atharva Raykar <raykar.ath@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-08 13:56:09 -07:00
Junio C Hamano	89b43f80a5	The eighth batch Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-08 13:23:26 -07:00
Junio C Hamano	14cc08de23	Merge branch 'ab/make-tags-quiet' Generate [ec]tags under $(QUIET_GEN). * ab/make-tags-quiet: Makefile: add QUIET_GEN to "tags" and "TAGS" targets	2021-04-08 13:23:26 -07:00
Junio C Hamano	bde35a2a93	Merge branch 'rs/daemon-sanitize-dir-sep' "git daemon" has been tightened against systems that take backslash as directory separator. * rs/daemon-sanitize-dir-sep: daemon: sanitize all directory separators	2021-04-08 13:23:26 -07:00
Junio C Hamano	1b31224e59	Merge branch 'en/ort-perf-batch-9' The ort merge backend has been optimized by skipping irrelevant renames. * en/ort-perf-batch-9: diffcore-rename: avoid doing basename comparisons for irrelevant sources merge-ort: skip rename detection entirely if possible merge-ort: use relevant_sources to filter possible rename sources merge-ort: precompute whether directory rename detection is needed merge-ort: introduce wrappers for alternate tree traversal merge-ort: add data structures for an alternate tree traversal merge-ort: precompute subset of sources for which we need rename detection diffcore-rename: enable filtering possible rename sources	2021-04-08 13:23:26 -07:00
Junio C Hamano	82fd285e46	Merge branch 'en/sequencer-edit-upon-conflict-fix' "git cherry-pick/revert" with or without "--[no-]edit" did not spawn the editor as expected (e.g. "revert --no-edit" after a conflict still asked to edit the message), which has been corrected. * en/sequencer-edit-upon-conflict-fix: sequencer: fix edit handling for cherry-pick and revert messages	2021-04-08 13:23:26 -07:00
Junio C Hamano	22eee7f455	Merge branch 'll/clone-reject-shallow' "git clone --reject-shallow" option fails the clone as soon as we notice that we are cloning from a shallow repository. * ll/clone-reject-shallow: builtin/clone.c: add --reject-shallow option	2021-04-08 13:23:25 -07:00
Junio C Hamano	e6b971fcf5	Merge branch 'tb/reverse-midx' An on-disk reverse-index to map the in-pack location of an object back to its object name across multiple packfiles is introduced. * tb/reverse-midx: midx.c: improve cache locality in midx_pack_order_cmp() pack-revindex: write multi-pack reverse indexes pack-write.c: extract 'write_rev_file_order' pack-revindex: read multi-pack reverse indexes Documentation/technical: describe multi-pack reverse indexes midx: make some functions non-static midx: keep track of the checksum midx: don't free midx_name early midx: allow marking a pack as preferred t/helper/test-read-midx.c: add '--show-objects' builtin/multi-pack-index.c: display usage on unrecognized command builtin/multi-pack-index.c: don't enter bogus cmd_mode builtin/multi-pack-index.c: split sub-commands builtin/multi-pack-index.c: define common usage with a macro builtin/multi-pack-index.c: don't handle 'progress' separately builtin/multi-pack-index.c: inline 'flags' with options	2021-04-08 13:23:25 -07:00
Ævar Arnfjörð Bjarmason	f08b4013c3	blame tests: simplify userdiff driver test Simplify the test added in `9466e3809d` (blame: enable funcname blaming with userdiff driver, 2020-11-01) to use the --author support recently added in `999cfc4f45` (test-lib functions: add --author support to test_commit, 2021-01-12). We also did not need the full fortran-external-function content. Let's cut it down to just the important parts. I'm modifying it to demonstrate that the fortran-specific userdiff function is in effect by adding "DO NOT MATCH ..." and "AS THE ..." lines surrounding the "RIGHT" one. This is to check that we're using the userdiff "fortran" driver, as opposed to the default driver which would match on those lines as part of the general heuristic of matching a line that doesn't begin with whitespace. The test had also been leaving behind a .gitattributes file for later tests to possibly trip over, let's clean it up with "test_when_finished". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-08 12:19:10 -07:00
Ævar Arnfjörð Bjarmason	b269441be2	blame tests: don't rely on t/t4018/ directory Refactor a test added in `9466e3809d` (blame: enable funcname blaming with userdiff driver, 2020-11-01) so that the blame tests don't rely on stealing the contents of "t/t4018/fortran-external-function". I have another patch series that'll possibly (or not) refactor that file, but having this test inter-dependency makes things simple in any case by making this test more readable. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-08 12:19:10 -07:00
Ævar Arnfjörð Bjarmason	6cb77966ec	userdiff: remove support for "broken" tests There have been no "broken" tests since `75c3b6b2e8` (userdiff: improve Fortran xfuncname regex, 2020-08-12). Let's remove the test support for them. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-08 12:19:10 -07:00
Ævar Arnfjörð Bjarmason	28e8f0d5e5	userdiff tests: list builtin drivers via test-tool Change the userdiff test to list the builtin drivers via the test-tool, using the new for_each_userdiff_driver() API function. This gets rid of the need to modify this part of the test every time a new pattern is added, see `2ff6c34612` (userdiff: support Bash, 2020-10-22) and `09dad9256a` (userdiff: support Markdown, 2020-05-02) for two recent examples. I only need the "list-builtin-drivers "argument here, but let's add "list-custom-drivers" and "list-drivers" too, just because it's easy. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-08 12:19:10 -07:00
Ævar Arnfjörð Bjarmason	132bf25989	userdiff tests: explicitly test "default" pattern Since `122aa6f9c0` (diff: introduce diff.<driver>.binary, 2008-10-05) the internals of the userdiff.c code have understood a "default" name, which is invoked as userdiff_find_by_name("default") and present in the "builtin_drivers" struct. Let's test for this special case. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-08 12:19:10 -07:00
Ævar Arnfjörð Bjarmason	f12fa9ee6c	userdiff: add and use for_each_userdiff_driver() Refactor the userdiff_find_by_namelen() function so that a new for_each_userdiff_driver() API function does most of the work. This will be useful for the same reason we've got other for_each_*() API functions as part of various APIs, and will be used in a follow-up commit. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-08 12:19:10 -07:00
Ævar Arnfjörð Bjarmason	82512e008c	userdiff style: normalize pascal regex declaration Declare the pascal pattern consistently with how we declare the others, not having "\n" on one line by itself, but as part of the pattern, and when there are alterations have the "\|" at the start, not end of the line. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-08 12:19:09 -07:00
Ævar Arnfjörð Bjarmason	6d1c9c527e	userdiff style: declare patterns with consistent style Change those patterns which were declared with a regex on the same line as the "PATTERNS()" line to put that regex on the next line, and add missing "/* -- */" separator comments between the pattern and word_regex. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-08 12:19:09 -07:00
Ævar Arnfjörð Bjarmason	ddd164d026	userdiff style: re-order drivers in alphabetical order Address some old code smell and move around the built-in userdiff drivers so they're both in alphabetical order, and now in the same order they appear in the gitattributes(5) documentation. The two started drifting in `be58e70dba` (diff: unify external diff and funcname parsing code, 2008-10-05), and then even further in `80c49c3de2` (color-words: make regex configurable via attributes, 2009-01-17) when the "cpp" pattern was added. There are no functional changes here, and as --color-moved will show only moved existing lines. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-08 12:19:09 -07:00
Ævar Arnfjörð Bjarmason	39e12650d7	config.c: remove last remnant of GIT_TEST_GETTEXT_POISON Remove a use of GIT_TEST_GETTEXT_POISON added in `f276e2a469` (config: improve error message for boolean config, 2021-02-11). This was simultaneously in-flight with my `d162b25f95` (tests: remove support for GIT_TEST_GETTEXT_POISON, 2021-01-20) which removed the rest of the GIT_TEST_GETTEXT_POISON code. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-08 10:54:08 -07:00
Ville Skyttä	c5c0548d79	completion: audit and guard $GIT_* against unset use $GIT_COMPLETION_SHOW_ALL and $GIT_TESTING_ALL_COMMAND_LIST were used without guarding against them being unset, causing errors in nounset (set -u) mode. No other nounset-unsafe $GIT_* usages were found. While at it, remove a superfluous (duplicate) unset guard from $GIT_DIR in __git_find_repo_path. Signed-off-by: Ville Skyttä <ville.skytta@iki.fi> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-08 10:45:36 -07:00
Jerry Zhang	c0c2a37ac2	git-apply: allow simultaneous --cached and --3way options "git apply" does not allow "--cached" and "--3way" to be used together, since "--3way" writes conflict markers into the working tree. Allow "git apply" to accept "--cached" and "--3way" at the same time. When a single file auto-resolves cleanly, the result is placed in the index at stage #0 and the command exits with 0 status. For a file that has a conflict which cannot be cleanly auto-resolved, the original contents from common ancestor (stage conflict at the content level, and the command exists with non-zero status, because there is no place (like the working tree) to leave a half-resolved merge for the user to resolve. The user can use `git diff` to view the contents of the conflict, or `git checkout -m -- .` to regenerate the conflict markers in the working directory. Don't attempt rerere in this case since it depends on conflict markers written to file for its database storage and lookup. There would be two main changes required to get rerere working: 1. Allow the rerere api to accept in memory object rather than files, which would allow us to pass in the conflict markers contained in the result from ll_merge(). 2. Rerere can't write to the working directory, so it would have to apply the result to cache stage #0 directly. A flag would be needed to control this. Signed-off-by: Jerry Zhang <jerry@skydio.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-07 22:20:33 -07:00
Junio C Hamano	a0dda6023e	The seventh batch Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-07 16:54:09 -07:00
Junio C Hamano	5644419d04	Merge branch 'ab/fsck-api-cleanup' Fsck API clean-up. * ab/fsck-api-cleanup: fetch-pack: use new fsck API to printing dangling submodules fetch-pack: use file-scope static struct for fsck_options fetch-pack: don't needlessly copy fsck_options fsck.c: move gitmodules_{found,done} into fsck_options fsck.c: add an fsck_set_msg_type() API that takes enums fsck.c: pass along the fsck_msg_id in the fsck_error callback fsck.[ch]: move FOREACH_FSCK_MSG_ID & fsck_msg_id from .c to .h fsck.c: give "FOREACH_MSG_ID" a more specific name fsck.c: undefine temporary STR macro after use fsck.c: call parse_msg_type() early in fsck_set_msg_type() fsck.h: re-order and re-assign "enum fsck_msg_type" fsck.h: move FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} into an enum fsck.c: refactor fsck_msg_type() to limit scope of "int msg_type" fsck.c: rename remaining fsck_msg_id "id" to "msg_id" fsck.c: remove (mostly) redundant append_msg_id() function fsck.c: rename variables in fsck_set_msg_type() for less confusion fsck.h: use "enum object_type" instead of "int" fsck.h: use designed initializers for FSCK_OPTIONS_{DEFAULT,STRICT} fsck.c: refactor and rename common config callback	2021-04-07 16:54:09 -07:00
Junio C Hamano	d637a267d8	Merge branch 'cc/downcase-opt-help' A few option description strings started with capital letters, which were corrected. * cc/downcase-opt-help: column, range-diff: downcase option description	2021-04-07 16:54:09 -07:00
Junio C Hamano	3cf14f88de	Merge branch 'js/security-md' SECURITY.md that is facing individual contributors and end users has been introduced. Also a procedure to follow when preparing embargoed releases has been spelled out. * js/security-md: Document how we do embargoed releases SECURITY: describe how to report vulnerabilities	2021-04-07 16:54:09 -07:00
Junio C Hamano	58840e62a4	Merge branch 'ps/pack-bitmap-optim' Optimize "rev-list --use-bitmap-index --objects" corner case that uses negative tags as the stopping points. * ps/pack-bitmap-optim: pack-bitmap: avoid traversal of objects referenced by uninteresting tag	2021-04-07 16:54:09 -07:00
Junio C Hamano	68e15e0c23	Merge branch 'zh/commit-trailer' "git commit" learned "--trailer <key>[=<value>]" option; together with the interpret-trailers command, this will make it easier to support custom trailers. * zh/commit-trailer: commit: add --trailer option	2021-04-07 16:54:08 -07:00
Junio C Hamano	a548f3e0ad	Merge branch 'js/cmake-vsbuild' CMake update for vsbuild. * js/cmake-vsbuild: cmake(install): include vcpkg dlls cmake: add a preparatory work-around to accommodate `vcpkg` cmake(install): fix double .exe suffixes cmake: support SKIP_DASHED_BUILT_INS	2021-04-07 16:54:08 -07:00
Junio C Hamano	573c5e50ab	Merge branch 'ds/clarify-hashwrite' The hashwrite() API uses a buffering mechanism to avoid calling write(2) too frequently. This logic has been refactored to be easier to understand. * ds/clarify-hashwrite: csum-file: make hashwrite() more readable	2021-04-07 16:54:08 -07:00
Junio C Hamano	642a40019c	Merge branch 'ah/plugleaks' Plug or annotate remaining leaks that trigger while running the very basic set of tests. * ah/plugleaks: transport: also free remote_refs in transport_disconnect() parse-options: don't leak alias help messages parse-options: convert bitfield values to use binary shift init-db: silence template_dir leak when converting to absolute path init: remove git_init_db_config() while fixing leaks worktree: fix leak in dwim_branch() clone: free or UNLEAK further pointers when finished reset: free instead of leaking unneeded ref symbolic-ref: don't leak shortened refname in check_symref()	2021-04-07 16:54:08 -07:00
Ævar Arnfjörð Bjarmason	3994ae510e	bash completion: complete CHERRY_PICK_HEAD When e.g. in a failed cherry pick we did not recognize CHERRY_PICK_HEAD as we do e.g. REBASE_HEAD in a failed rebase let's rectify that. When REBASE_HEAD was added in `fbd7a23237` (rebase: introduce and use pseudo-ref REBASE_HEAD, 2018-02-11) a completion was added for it, but no corresponding completion existed for CHERRY_PICK_HEAD added in `d7e5c0cbfb` (Introduce CHERRY_PICK_HEAD, 2011-02-19). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-07 15:14:51 -07:00
Jerry Zhang	923cd87ac8	git-apply: try threeway first when "--3way" is used The apply_fragments() method of "git apply" can silently apply patches incorrectly if a file has repeating contents. In these cases a three-way merge is capable of applying it correctly in more situations, and will show a conflict rather than applying it incorrectly. However, because the patches apply "successfully" using apply_fragments(), git will never fall back to the merge, even if the "--3way" flag is used, and the user has no way to ensure correctness by forcing the three-way merge method. Change the behavior so that when "--3way" is used, git will always try the three-way merge first and will only fall back to apply_fragments() in cases where blobs are not available or some other error (but not in the case of a merge conflict). Since user-facing results will be different, this has backwards compatibility implications for users depending on the old behavior. In addition, the three-way merge will be slower than direct patch application. Signed-off-by: Jerry Zhang <jerry@skydio.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-06 17:11:41 -07:00
Derrick Stolee	a039a1fcf9	maintenance: simplify prefetch logic The previous logic filled a string list with the names of each remote, but instead we could simply run the appropriate 'git fetch' data directly in the remote iterator. Do this for reduced code size, but also because it sets up an upcoming change to use the remote's refspec. This data is accessible from the 'struct remote' data that is now accessible in fetch_remote(). Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-06 14:23:47 -07:00
Ævar Arnfjörð Bjarmason	ea7811b37e	git-send-email: improve --validate error output Improve the output we emit on --validate error to: * Say "FILE:LINE" instead of "FILE: LINE", to match "grep -n", compiler error messages etc. * Don't say "patch contains a" after just mentioning the filename, just leave it at "FILE:LINE: is longer than[...]. The "contains a" sounded like we were talking about the file in general, when we're actually checking it line-by-line. * Don't just say "rejected by sendemail-validate hook", but combine that with the system_or_msg() output to say what exit code the hook died with. I had an aborted attempt to make the line length checker note all lines that were longer than the limit. I didn't think that was worth the effort, but I've left in the testing change to check that we die as soon as we spot the first long line. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-06 12:57:06 -07:00
Ævar Arnfjörð Bjarmason	d21616c039	git-send-email: refactor duplicate $? checks into a function Refactor the duplicate checking of $? into a function. There's an outstanding series[1] wanting to add a third use of system() in this file, let's not copy this boilerplate anymore when that happens. 1. http://lore.kernel.org/git/87y2esg22j.fsf@evledraar.gmail.com Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-06 12:57:06 -07:00
Ævar Arnfjörð Bjarmason	e585210e1b	git-send-email: test full --validate output Change the tests that grep substrings out of the output to use a full test_cmp, in preparation for improving the output. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-06 12:57:05 -07:00
Christian Couder	dba94e3a85	test-bloom: fix missing 'bloom' from usage string Like 'get_murmur3' and 'generate_filter', 'get_filter_for_commit' is a subcommand of `test-tool bloom` not of `test-tool` itself. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-05 22:54:34 -07:00
Torsten Bögershausen	c7d0e61016	macOS: precompose startup_info->prefix The "prefix" was precomposed for macOS in commit `5c327502` (MacOS: precompose_argv_prefix(), 2021-02-03). However, this commit forgot to update "startup_info->prefix" after precomposing. Move the (possible) precomposition towards the end of setup_git_directory_gently(), so that precompose_string_if_needed() can use git_config_get_bool("core.precomposeunicode") correctly. Keep prefix, startup_info->prefix and GIT_PREFIX_ENVIRONMENT all in sync. And as a result, the prefix no longer needs to be precomposed in git.c Reported-by: Dmitry Torilov <d.torilov@gmail.com> Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-05 17:30:36 -07:00
Torsten Bögershausen	5020774aef	precompose_utf8: make precompose_string_if_needed() public commit `5c327502` (MacOS: precompose_argv_prefix(), 2021-02-03) uses the function precompose_string_if_needed() internally. It is only used from precompose_argv_prefix() and therefore static in compat/precompose_utf8.c Expose this function, it will be used in the next commit. While there, allow passing a NULL pointer, which will return NULL. Signed-off-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-05 17:30:04 -07:00
Firmin Martin	fc12b6fdde	user-manual.txt: assign preface an id and a title Two among the three warnings raised by "make git.info" are related to the fact that the preface has not id in user-manual.txt. user-manual.texi:15: warning: empty menu entry name in `* : idm4.' user-manual.texi:141: warning: @unnumbered missing argument This causes asciidoc creating an empty preface and an empty title tag in user-manual.xml which turns to be an empty node in user-manual.texi and git.info. Consequently, one can notice in user-manual.texi and git.info a node named "idm4" in the menu and the navigation bar. In emacs, the first entry of the menu in the git info page is even displayed as empty. This fix will name "Introduction" the preface and assign it an id. The result can be seen in the files: user-manual.{xml, texi, html, pdf} and git.info. For future reference, the diff between old and new user-manual.xml, user-manual.texi, git.info, user-manual.html (converted through html2markdown) and user-manual.pdf (converted through pdftotext) are attached. --- before/user-manual.xml 2021-04-04 03:58:47.758008722 +0200 +++ after/user-manual.xml 2021-04-04 03:56:40.520551163 +0200 @@ -7,8 +7,8 @@ <bookinfo> <title>Git User Manual</title> </bookinfo> -<preface> -<title></title> +<preface id="_introduction"> +<title>Introduction</title> <simpara>Git is a fast distributed revision control system.</simpara> <simpara>This manual is designed to be readable by someone with basic UNIX command-line skills, but no previous knowledge of Git.</simpara> --- before/user-manual.texi 2021-04-04 03:58:47.490005652 +0200 +++ after/user-manual.texi 2021-04-04 03:56:40.520551163 +0200 @@ -7,12 +7,12 @@ * Git: (git). A fast distributed revision control system @end direntry -@node Top, idm4, , (dir) +@node Top, Introduction, , (dir) @documentlanguage en @top Git User Manual @menu -* : idm4. +* Introduction:: * Repositories and Branches:: * Exploring Git history:: * Developing with Git:: @@ -137,8 +137,8 @@ @end detailmenu @end menu -@node idm4, Repositories and Branches, Top, Top -@unnumbered +@node Introduction, Repositories and Branches, Top, Top +@unnumbered Introduction Git is a fast distributed revision control system. @@ -178,7 +178,7 @@ Finally, see @ref{Notes and todo list for this manual} for ways that you can help make this manual more complete. -@node Repositories and Branches, Exploring Git history, idm4, Top +@node Repositories and Branches, Exploring Git history, Introduction, Top @chapter Repositories and Branches @menu --- before/git.info 2021-04-04 03:58:46.557994966 +0200 +++ after/git.info 2021-04-04 03:56:40.520551163 +0200 @@ -7,14 +7,14 @@ END-INFO-DIR-ENTRY -File: git.info, Node: Top, Next: idm4, Up: (dir) +File: git.info, Node: Top, Next: Introduction, Up: (dir) Git User Manual *************** * Menu: -* : idm4. +* Introduction:: * Repositories and Branches:: * Exploring Git history:: * Developing with Git:: @@ -137,7 +137,10 @@ -File: git.info, Node: idm4, Next: Repositories and Branches, Prev: Top, Up: Top +File: git.info, Node: Introduction, Next: Repositories and Branches, Prev: Top, Up: Top + +Introduction +********** Git is a fast distributed revision control system. @@ -174,7 +177,7 @@ that you can help make this manual more complete. -File: git.info, Node: Repositories and Branches, Next: Exploring Git history, Prev: idm4, Up: Top +File: git.info, Node: Repositories and Branches, Next: Exploring Git history, Prev: Introduction, Up: Top 1 Repositories and Branches *********************** @@ -5471,207 +5474,207 @@ ... Tag Table: Node: Top212 -Node: idm43164 -Node: Repositories and Branches4465 ... +Node: Introduction3179 +Node: Repositories and Branches4515 +Node: How to get a Git repository5128 ... End Tag Table --- before/user-manual.html.md 2021-04-04 05:20:55.378695854 +0200 +++ after/user-manual.html.md 2021-04-04 05:21:11.282850802 +0200 @@ -4,6 +4,8 @@ Table of Contents** +Introduction + 1\. Repositories and Branches @@ -278,7 +280,7 @@ Todo list -# +# Introduction Git is a fast distributed revision control system. --- before/user-manual.pdf.txt 2021-04-04 05:28:20.367036836 +0200 +++ after/user-manual.pdf.txt 2021-04-04 05:30:01.680026312 +0200 @@ -487,6 +487,7 @@ vii +Introduction Git is a fast distributed revision control system. This manual is designed to be readable by someone with basic UNIX command-line skills, but no previous knowledge of Git. Chapter 1 and Chapter 2 explain how to fetch and study a project using git—read these chapters to learn how to build and test a Signed-off-by: Firmin Martin <firminmartin24@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-03 23:19:04 -07:00
Junio C Hamano	2e36527f23	The sixth batch Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-02 14:43:31 -07:00
Junio C Hamano	8a4394d1c1	Merge branch 'zh/format-patch-fractional-reroll-count' "git format-patch -v<n>" learned to allow a reroll count that is not an integer. * zh/format-patch-fractional-reroll-count: format-patch: allow a non-integral version numbers	2021-04-02 14:43:14 -07:00
Junio C Hamano	861794b60d	Merge branch 'jh/simple-ipc' A simple IPC interface gets introduced to build services like fsmonitor on top. * jh/simple-ipc: t0052: add simple-ipc tests and t/helper/test-simple-ipc tool simple-ipc: add Unix domain socket implementation unix-stream-server: create unix domain socket under lock unix-socket: disallow chdir() when creating unix domain sockets unix-socket: add backlog size option to unix_stream_listen() unix-socket: eliminate static unix_stream_socket() helper function simple-ipc: add win32 implementation simple-ipc: design documentation for new IPC mechanism pkt-line: add options argument to read_packetized_to_strbuf() pkt-line: add PACKET_READ_GENTLE_ON_READ_ERROR option pkt-line: do not issue flush packets in write_packetized_*() pkt-line: eliminate the need for static buffer in packet_write_gently()	2021-04-02 14:43:14 -07:00
Junio C Hamano	c47679d040	Merge branch 'mt/parallel-checkout-part-1' Preparatory API changes for parallel checkout. * mt/parallel-checkout-part-1: entry: add checkout_entry_ca() taking preloaded conv_attrs entry: move conv_attrs lookup up to checkout_entry() entry: extract update_ce_after_write() from write_entry() entry: make fstat_output() and read_blob_entry() public entry: extract a header file for entry.c functions convert: add classification for conv_attrs struct convert: add get_stream_filter_ca() variant convert: add [async_]convert_to_working_tree_ca() variants convert: make convert_attrs() and convert structs public	2021-04-02 14:43:14 -07:00
Ævar Arnfjörð Bjarmason	b362acf575	git-send-email: replace "map" in void context with "for" While using "map" instead of "for" or "map" instead of "grep" and vice-versa makes for interesting trivia questions when interviewing Perl programmers, it doesn't make for very readable code. Let's refactor this loop initially added in `8fd5bb7f44` (git send-email: add --annotate option, 2008-11-11) to be a for-loop instead. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-02 14:32:29 -07:00
Ævar Arnfjörð Bjarmason	3c80fcb591	Makefile: add QUIET_GEN to "tags" and "TAGS" targets Don't show the very verbose $(FIND_SOURCE_FILES) command on every "make TAGS" invocation. Let's use "generate into temporary and rename to the final file, after seeing the command that generated the output finished successfully" pattern, to avoid leaving a file with an incorrect output generated by a failed command. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-01 22:23:39 -07:00
Jeff King	3007752461	midx.c: improve cache locality in midx_pack_order_cmp() There is a lot of pointer dereferencing in the pre-image version of 'midx_pack_order_cmp()', which this patch gets rid of. Instead of comparing the pack preferred-ness and then the pack id, both of these checks are done at the same time by using the high-order bit of the pack id to represent whether it's preferred. Then the pack id and offset are compared as usual. This produces the same result so long as there are less than 2^31 packs, which seems like a likely assumption to make in practice. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-01 13:07:37 -07:00
Taylor Blau	38ff7cabb6	pack-revindex: write multi-pack reverse indexes Implement the writing half of multi-pack reverse indexes. This is nothing more than the format describe a few patches ago, with a new set of helper functions that will be used to clear out stale .rev files corresponding to old MIDXs. Unfortunately, a very similar comparison function as the one implemented recently in pack-revindex.c is reimplemented here, this time accepting a MIDX-internal type. An effort to DRY these up would create more indirection and overhead than is necessary, so it isn't pursued here. Currently, there are no callers which pass the MIDX_WRITE_REV_INDEX flag, meaning that this is all dead code. But, that won't be the case for long, since subsequent patches will introduce the multi-pack bitmap, which will begin passing this field. (In midx.c:write_midx_internal(), the two adjacent if statements share a conditional, but are written separately since the first one will eventually also handle the MIDX_WRITE_BITMAP flag, which does not yet exist.) Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-01 13:07:37 -07:00
Taylor Blau	a587b5a786	pack-write.c: extract 'write_rev_file_order' Existing callers provide the reverse index code with an array of 'struct pack_idx_entry *'s, which is then sorted by pack order (comparing the offsets of each object within the pack). Prepare for the multi-pack index to write a .rev file by providing a way to write the reverse index without an array of pack_idx_entry (which the MIDX code does not have). Instead, callers can invoke 'write_rev_index_positions()', which takes an array of uint32_t's. The ith entry in this array specifies the ith object's (in index order) position within the pack (in pack order). Expose this new function for use in a later patch, and rewrite the existing write_rev_file() in terms of this new function. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-01 13:07:37 -07:00
Taylor Blau	f894081dea	pack-revindex: read multi-pack reverse indexes Implement reading for multi-pack reverse indexes, as described in the previous patch. Note that these functions don't yet have any callers, and won't until multi-pack reachability bitmaps are introduced in a later patch series. In the meantime, this patch implements some of the infrastructure necessary to support multi-pack bitmaps. There are three new functions exposed by the revindex API: - load_midx_revindex(): loads the reverse index corresponding to the given multi-pack index. - midx_to_pack_pos() and pack_pos_to_midx(): these convert between the multi-pack index and pseudo-pack order. load_midx_revindex() and pack_pos_to_midx() are both relatively straightforward. load_midx_revindex() needs a few functions to be exposed from the midx API. One to get the checksum of a midx, and another to get the .rev's filename. Similar to recent changes in the packed_git struct, three new fields are added to the multi_pack_index struct: one to keep track of the size, one to keep track of the mmap'd pointer, and another to point past the header and at the reverse index's data. pack_pos_to_midx() simply reads the corresponding entry out of the table. midx_to_pack_pos() is the trickiest, since it needs to find an object's position in the psuedo-pack order, but that order can only be recovered in the .rev file itself. This mapping can be implemented with a binary search, but note that the thing we're binary searching over isn't an array of values, but rather a permuted order of those values. So, when comparing two items, it's helpful to keep in mind the difference. Instead of a traditional binary search, where you are comparing two things directly, here we're comparing a (pack, offset) tuple with an index into the multi-pack index. That index describes another (pack, offset) tuple, and it is _those_ two tuples that are compared. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-01 13:07:37 -07:00
Taylor Blau	b25fd24c00	Documentation/technical: describe multi-pack reverse indexes As a prerequisite to implementing multi-pack bitmaps, motivate and describe the format and ordering of the multi-pack reverse index. The subsequent patch will implement reading this format, and the patch after that will implement writing it while producing a multi-pack index. Co-authored-by: Jeff King <peff@peff.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-01 13:07:37 -07:00
Taylor Blau	62f2c1b509	midx: make some functions non-static In a subsequent commit, pack-revindex.c will become responsible for sorting a list of objects in the "MIDX pack order" (which will be defined in the following patch). To do so, it will need to be know the pack identifier and offset within that pack for each object in the MIDX. The MIDX code already has functions for doing just that (nth_midxed_offset() and nth_midxed_pack_int_id()), but they are statically declared. Since there is no reason that they couldn't be exposed publicly, and because they are already doing exactly what the caller in pack-revindex.c will want, expose them publicly so that they can be reused there. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-01 13:07:37 -07:00
Taylor Blau	9f19161172	midx: keep track of the checksum write_midx_internal() uses a hashfile to write the multi-pack index, but discards its checksum. This makes sense, since nothing that takes place after writing the MIDX cares about its checksum. That is about to change in a subsequent patch, when the optional reverse index corresponding to the MIDX will want to include the MIDX's checksum. Store the checksum of the MIDX in preparation for that. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-01 13:07:37 -07:00
Taylor Blau	7240cc4b65	midx: don't free midx_name early A subsequent patch will need to refer back to 'midx_name' later on in the function. In fact, this variable is already free()'d later on, so this makes the later free() no longer redundant. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-01 13:07:37 -07:00
Taylor Blau	9218c6a40c	midx: allow marking a pack as preferred When multiple packs in the multi-pack index contain the same object, the MIDX machinery must make a choice about which pack it associates with that object. Prior to this patch, the lowest-ordered[1] pack was always selected. Pack selection for duplicate objects is relatively unimportant today, but it will become important for multi-pack bitmaps. This is because we can only invoke the pack-reuse mechanism when all of the bits for reused objects come from the reuse pack (in order to ensure that all reused deltas can find their base objects in the same pack). To encourage the pack selection process to prefer one pack over another (the pack to be preferred is the one a caller would like to later use as a reuse pack), introduce the concept of a "preferred pack". When provided, the MIDX code will always prefer an object found in a preferred pack over any other. No format changes are required to store the preferred pack, since it will be able to be inferred with a corresponding MIDX bitmap, by looking up the pack associated with the object in the first bit position (this ordering is described in detail in a subsequent commit). [1]: the ordering is specified by MIDX internals; for our purposes we can consider the "lowest ordered" pack to be "the one with the most-recent mtime. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-01 13:07:37 -07:00
Li Linchao	4fe788b1b0	builtin/clone.c: add --reject-shallow option In some scenarios, users may want more history than the repository offered for cloning, which happens to be a shallow repository, can give them. But because users don't know it is a shallow repository until they download it to local, we may want to refuse to clone this kind of repository, without creating any unnecessary files. The '--depth=x' option cannot be used as a solution; the source may be deep enough to give us 'x' commits when cloned, but the user may later need to deepen the history to arbitrary depth. Teach '--reject-shallow' option to "git clone" to abort as soon as we find out that we are cloning from a shallow repository. Signed-off-by: Li Linchao <lilinchao@oschina.cn> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-01 12:58:58 -07:00
Jeff King	c685450880	ref-filter: fix NULL check for parse object failure After we run parse_object_buffer() to get an object's contents, we try to check that the return value wasn't NULL. However, since our "struct object" is a pointer-to-pointer, and we assign like: *obj = parse_object_buffer(...); it's not correct to check: if (!obj) That will always be true, since our double pointer will continue to point to the single pointer (which is itself NULL). This is a regression that was introduced by `aa46a0da30` (ref-filter: use oid_object_info() to get object, 2018-07-17); since that commit we'll segfault on a parse failure, as we try to look at the NULL object pointer. There are many ways a parse could fail, but most of them are hard to set up in the tests (it's easy to make a bogus object, but update-ref will refuse to point to it). The test here uses a tag which points to a wrong object type. A parse of just the broken tag object will succeed, but seeing both tag objects in the same process will lead to a parse error (since we'll see the pointed-to object as both types). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-01 12:54:21 -07:00
Taylor Blau	3f267a1128	builtin/pack-objects.c: respect 'pack.preferBitmapTips' When writing a new pack with a bitmap, it is sometimes convenient to indicate some reference prefixes which should receive priority when selecting which commits to receive bitmaps. A truly motivated caller could accomplish this by setting 'pack.islandCore', (since all commits in the core island are similarly marked as preferred) but this requires callers to opt into using delta islands, which they may or may not want to do. Introduce a new multi-valued configuration, 'pack.preferBitmapTips' to allow callers to specify a list of reference prefixes. All references which have a prefix contained in 'pack.preferBitmapTips' will mark their tips as "preferred" in the same way as commits are marked as preferred for selection by 'pack.islandCore'. The choice of the verb "prefer" is intentional: marking the NEEDS_BITMAP flag on an object does not guarantee that that object will receive a bitmap. It merely guarantees that that commit will receive a bitmap over any other commit in the same window by bitmap_writer_select_commits(). The test this patch adds reflects this quirk, too. It only tests that a commit (which didn't receive bitmaps by default) is selected for bitmaps after changing the value of 'pack.preferBitmapTips' to include it. Other commits may lose their bitmaps as a byproduct of how the selection process works (bitmap_writer_select_commits() ignores the remainder of a window after seeing a commit with the NEEDS_BITMAP flag). This configuration will aide in selecting important references for multi-pack bitmaps, since they do not respect the same pack.islandCore configuration. (They could, but doing so may be confusing, since it is packs--not bitmaps--which are influenced by the delta-islands configuration). In a fork network repository (one which lists all forks of a given repository as remotes), for example, it is useful to set pack.preferBitmapTips to 'refs/remotes/<root>/heads' and 'refs/remotes/<root>/tags', where '<root>' is an opaque identifier referring to the repository which is at the base of the fork chain. Suggested-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-31 23:14:03 -07:00
Taylor Blau	483fa7f42d	t/helper/test-bitmap.c: initial commit Add a new 'bitmap' test-tool which can be used to list the commits that have received bitmaps. In theory, a determined tester could run 'git rev-list --test-bitmap <commit>' to check if '<commit>' received a bitmap or not, since '--test-bitmap' exits with a non-zero code when it can't find the requested commit. But this is a dubious behavior to rely on, since arguably 'git rev-list' could continue its object walk outside of which commits are covered by bitmaps. This will be used to test the behavior of 'pack.preferBitmapTips', which will be added in the following patch. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-31 23:14:03 -07:00
Taylor Blau	dff5e49e51	pack-bitmap: add 'test_bitmap_commits()' helper The next patch will add a 'bitmap' test-tool which prints the list of commits that have bitmaps computed. The test helper could implement this itself, but it would need access to the 'bitmaps' field of the 'pack_bitmap' struct. To avoid exposing this private detail, implement the entirety of the helper behind a test_bitmap_commits() function in pack-bitmap.c. There is some precedence for this with test_bitmap_walk() which is used to implement the '--test-bitmap' flag in 'git rev-list' (and is also implemented in pack-bitmap.c). A caller will be added in the next patch. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-31 23:14:03 -07:00
Elijah Newren	39edfd5cbc	sequencer: fix edit handling for cherry-pick and revert messages save_opts() should save any non-default values. It was intended to do this, but since most options in struct replay_opts default to 0, it only saved non-zero values. Unfortunately, this does not always work for options.edit. Roughly speaking, options.edit had a default value of 0 for cherry-pick but a default value of 1 for revert. Make save_opts() record a value whenever it differs from the default. options.edit was also overly simplistic; we had more than two cases. The behavior that previously existed was as follows: Non-conflict commits Right after Conflict revert Edit iff isatty(0) Edit (ignore isatty(0)) cherry-pick No edit See above Specify --edit Edit (ignore isatty(0)) See above Specify --no-edit () See above () Before stopping for conflicts, No edit is the behavior. After stopping for conflicts, the --no-edit flag is not saved so see the first two rows. However, the expected behavior is: Non-conflict commits Right after Conflict revert Edit iff isatty(0) Edit iff isatty(0) cherry-pick No edit Edit iff isatty(0) Specify --edit Edit (ignore isatty(0)) Edit (ignore isatty(0)) Specify --no-edit No edit No edit In order to get the expected behavior, we need to change options.edit to a tri-state: unspecified, false, or true. When specified, we follow what it says. When unspecified, we need to check whether the current commit being created is resolving a conflict as well as consulting options.action and isatty(0). While at it, add a should_edit() utility function that compresses options.edit down to a boolean based on the additional information for the non-conflict case. continue_single_pick() is the function responsible for resuming after conflict cases, regardless of whether there is one commit being picked or many. Make this function stop assuming edit behavior in all cases, so that it can correctly handle !isatty(0) and specific requests to not edit the commit message. Reported-by: Renato Botelho <garga@freebsd.org> Signed-off-by: Elijah Newren <newren@gmail.com> Reviewed-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-31 14:10:50 -07:00
Junio C Hamano	a65ce7f831	The fifth batch Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-30 14:35:38 -07:00
Junio C Hamano	5c2f7ff018	Merge branch 'jc/doc-format-patch-clarify' Explain pieces of the format-patch output upfront before the rest of the documentation starts referring to them. * jc/doc-format-patch-clarify: format-patch: give an overview of what a "patch" message is	2021-03-30 14:35:38 -07:00
Junio C Hamano	7652ce966f	Merge branch 'ab/detox-gettext-tests' Testfix. * ab/detox-gettext-tests: mktag tests: fix broken "&&" chain	2021-03-30 14:35:38 -07:00
Junio C Hamano	4730c5e273	Merge branch 'hx/pack-objects-chunk-comment' Comment update. * hx/pack-objects-chunk-comment: pack-objects: fix comment of reused_chunk.difference	2021-03-30 14:35:37 -07:00
Junio C Hamano	1ba947cf15	Merge branch 'rf/send-email-hookspath' "git send-email" learned to honor the core.hooksPath configuration. * rf/send-email-hookspath: git-send-email: Respect core.hooksPath setting	2021-03-30 14:35:37 -07:00
Junio C Hamano	dc2a073036	Merge branch 'ab/remove-rebase-usebuiltin' Remove the final hint that we used to have a scripted "git rebase". * ab/remove-rebase-usebuiltin: rebase: remove transitory rebase.useBuiltin setting & env	2021-03-30 14:35:37 -07:00
Junio C Hamano	5013802862	Merge branch 'cs/http-use-basic-after-failed-negotiate' When accessing a server with a URL like https://user:pass@site/, we did not to fall back to the basic authentication with the credential material embedded in the URL after the "Negotiate" authentication failed. Now we do. * cs/http-use-basic-after-failed-negotiate: remote-curl: fall back to basic auth if Negotiate fails	2021-03-30 14:35:37 -07:00
Junio C Hamano	b2309ad822	Merge branch 'ab/diff-no-index-tests' More test coverage over "diff --no-index". * ab/diff-no-index-tests: diff --no-index tests: test mode normalization diff --no-index tests: add test for --exit-code	2021-03-30 14:35:37 -07:00
Junio C Hamano	ad16f748f2	Merge branch 'ab/read-tree' Code simplification by removing support for a caller that is long gone. * ab/read-tree: tree.h API: simplify read_tree_recursive() signature tree.h API: expose read_tree_1() as read_tree_at() archive: stop passing "stage" through read_tree_recursive() ls-files: refactor away read_tree() ls-files: don't needlessly pass around stage variable tree.c API: move read_tree() into builtin/ls-files.c ls-files tests: add meaningful --with-tree tests show tests: add test for "git show <tree>"	2021-03-30 14:35:37 -07:00
Junio C Hamano	aab55b1d6e	Merge branch 'bs/asciidoctor-installation-hints' Doc update. * bs/asciidoctor-installation-hints: INSTALL: note on using Asciidoctor to build doc	2021-03-30 14:35:36 -07:00
Junio C Hamano	9210c68d2a	Merge branch 'mt/checkout-remove-nofollow' When "git checkout" removes a path that does not exist in the commit it is checking out, it wasn't careful enough not to follow symbolic links, which has been corrected. * mt/checkout-remove-nofollow: checkout: don't follow symlinks when removing entries symlinks: update comment on threaded_check_leading_path()	2021-03-30 14:35:36 -07:00
Derrick Stolee	c9e40ae8ec	p2000: add sparse-index repos p2000-sparse-operations.sh compares different Git commands in repositories with many files at HEAD but using sparse-checkout to focus on a small portion of those files. Add extra copies of the repository that use the sparse-index format so we can track how that affects the performance of different commands. At this point in time, the sparse-index is 100% overhead from the CPU front, and this is measurable in these tests: Test --------------------------------------------------------------- 2000.2: git status (full-index-v3) 0.59(0.51+0.12) 2000.3: git status (full-index-v4) 0.59(0.52+0.11) 2000.4: git status (sparse-index-v3) 1.40(1.32+0.12) 2000.5: git status (sparse-index-v4) 1.41(1.36+0.08) 2000.6: git add -A (full-index-v3) 2.32(1.97+0.19) 2000.7: git add -A (full-index-v4) 2.17(1.92+0.14) 2000.8: git add -A (sparse-index-v3) 2.31(2.21+0.15) 2000.9: git add -A (sparse-index-v4) 2.30(2.20+0.13) 2000.10: git add . (full-index-v3) 2.39(2.02+0.20) 2000.11: git add . (full-index-v4) 2.20(1.94+0.16) 2000.12: git add . (sparse-index-v3) 2.36(2.27+0.12) 2000.13: git add . (sparse-index-v4) 2.33(2.21+0.16) 2000.14: git commit -a -m A (full-index-v3) 2.47(2.12+0.20) 2000.15: git commit -a -m A (full-index-v4) 2.26(2.00+0.17) 2000.16: git commit -a -m A (sparse-index-v3) 3.01(2.92+0.16) 2000.17: git commit -a -m A (sparse-index-v4) 3.01(2.94+0.15) Note that there is very little difference between the v3 and v4 index formats when the sparse-index is enabled. This is primarily due to the fact that the relative file sizes are the same, and the command time is mostly taken up by parsing tree objects to expand the sparse index into a full one. With the current file layout, the index file sizes are given by this table: \| full index \| sparse index \| +-------------+--------------+ v3 \| 108 MiB \| 1.6 MiB \| v4 \| 80 MiB \| 1.2 MiB \| Future updates will improve the performance of Git commands when the index is sparse. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-30 12:57:49 -07:00
Derrick Stolee	9ad2d5ea71	sparse-index: loose integration with cache_tree_verify() The cache_tree_verify() method is run when GIT_TEST_CHECK_CACHE_TREE is enabled, which it is by default in the test suite. The logic must be adjusted for the presence of these directory entries. For now, leave the test as a simple check for whether the directory entry is sparse. Do not go any further until needed. This allows us to re-enable GIT_TEST_CHECK_CACHE_TREE in t1092-sparse-checkout-compatibility.sh. Further, p2000-sparse-operations.sh uses the test suite and hence this is enabled for all tests. We need to integrate with it before we run our performance tests with a sparse-index. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-30 12:57:48 -07:00
Derrick Stolee	2de37c536d	cache-tree: integrate with sparse directory entries The cache-tree extension was previously disabled with sparse indexes. However, the cache-tree is an important performance feature for commands like 'git status' and 'git add'. Integrate it with sparse directory entries. When writing a sparse index, completely clear and recalculate the cache tree. By starting from scratch, the only integration necessary is to check if we hit a sparse directory entry and create a leaf of the cache-tree that has an entry_count of one and no subtrees. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-30 12:57:48 -07:00
Derrick Stolee	dcc5fd5fd2	sparse-checkout: disable sparse-index We use 'git sparse-checkout init --cone --sparse-index' to toggle the sparse-index feature. It makes sense to also disable it when running 'git sparse-checkout disable'. This is particularly important because it removes the extensions.sparseIndex config option, allowing other tools to use this Git repository again. This does mean that 'git sparse-checkout init' will not re-enable the sparse-index feature, even if it was previously enabled. While testing this feature, I noticed that the sparse-index was not being written on the first run, but by a second. This was caught by the call to 'test-tool read-cache --table'. This requires adjusting some assignments to core_apply_sparse_checkout and pl.use_cone_patterns in the sparse_checkout_init() logic. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-30 12:57:48 -07:00
Derrick Stolee	122ba1f7b5	sparse-checkout: toggle sparse index from builtin The sparse index extension is used to signal that index writes should be in sparse mode. This was only updated using GIT_TEST_SPARSE_INDEX=1. Add a '--[no-]sparse-index' option to 'git sparse-checkout init' that specifies if the sparse index should be used. It also updates the index to use the correct format, either way. Add a warning in the documentation that the use of a repository extension might reduce compatibility with third-party tools. 'git sparse-checkout init' already sets extension.worktreeConfig, which places most sparse-checkout users outside of the scope of most third-party tools. Update t1092-sparse-checkout-compatibility.sh to use this CLI instead of GIT_TEST_SPARSE_INDEX=1. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-30 12:57:48 -07:00
Derrick Stolee	58300f4743	sparse-index: add index.sparse config option When enabled, this config option signals that index writes should attempt to use sparse-directory entries. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-30 12:57:47 -07:00
Derrick Stolee	0938e6ff55	sparse-index: check index conversion happens Add a test case that uses test_region to ensure that we are truly expanding a sparse index to a full one, then converting back to sparse when writing the index. As we integrate more Git commands with the sparse index, we will convert these commands to check that we do _not_ convert the sparse index to a full index and instead stay sparse the entire time. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-30 12:57:47 -07:00
Derrick Stolee	13e1331247	unpack-trees: allow sparse directories The index_pos_by_traverse_info() currently throws a BUG() when a directory entry exists exactly in the index. We need to consider that it is possible to have a directory in a sparse index as long as that entry is itself marked with the skip-worktree bit. The 'pos' variable is assigned a negative value if an exact match is not found. Since a directory name can be an exact match, it is no longer an error to have a nonnegative 'pos' value. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-30 12:57:47 -07:00
Derrick Stolee	f442313e2e	submodule: sparse-index should not collapse links A submodule is stored as a "Git link" that actually points to a commit within a submodule. Submodules are populated or not depending on submodule configuration, not sparse-checkout. To ensure that the sparse-index feature integrates correctly with submodules, we should not collapse a directory if there is a Git link within its range. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-30 12:57:47 -07:00
Derrick Stolee	6e773527b6	sparse-index: convert from full to sparse If we have a full index, then we can convert it to a sparse index by replacing directories outside of the sparse cone with sparse directory entries. The convert_to_sparse() method does this, when the situation is appropriate. For now, we avoid converting the index to a sparse index if: 1. the index is split. 2. the index is already sparse. 3. sparse-checkout is disabled. 4. sparse-checkout does not use cone mode. Finally, we currently limit the conversion to when the GIT_TEST_SPARSE_INDEX environment variable is enabled. A mode using Git config will be added in a later change. The trickiest thing about this conversion is that we might not be able to mark a directory as a sparse directory just because it is outside the sparse cone. There might be unmerged files within that directory, so we need to look for those. Also, if there is some strange reason why a file is not marked with CE_SKIP_WORKTREE, then we should give up on converting that directory. There is still hope that some of its subdirectories might be able to convert to sparse, so we keep looking deeper. The conversion process is assisted by the cache-tree extension. This is calculated from the full index if it does not already exist. We then abandon the cache-tree as it no longer applies to the newly-sparse index. Thus, this cache-tree will be recalculated in every sparse-full-sparse round-trip until we integrate the cache-tree extension with the sparse index. Some Git commands use the index after writing it. For example, 'git add' will update the index, then write it to disk, then read its entries to report information. To keep the in-memory index in a full state after writing, we re-expand it to a full one after the write. This is wasteful for commands that only write the index and do not read from it again, but that is only the case until we make those commands "sparse aware." We can compare the behavior of the sparse-index in t1092-sparse-checkout-compability.sh by using GIT_TEST_SPARSE_INDEX=1 when operating on the 'sparse-index' repo. We can also compare the two sparse repos directly, such as comparing their indexes (when expanded to full in the case of the 'sparse-index' repo). We also verify that the index is actually populated with sparse directory entries. The 'checkout and reset (mixed)' test is marked for failure when comparing a sparse repo to a full repo, but we can compare the two sparse-checkout cases directly to ensure that we are not changing the behavior when using a sparse index. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-30 12:57:47 -07:00
Derrick Stolee	cd42415fb4	sparse-index: add 'sdir' index extension The index format does not currently allow for sparse directory entries. This violates some expectations that older versions of Git or third-party tools might not understand. We need an indicator inside the index file to warn these tools to not interact with a sparse index unless they are aware of sparse directory entries. Add a new _required_ index extension, 'sdir', that indicates that the index may contain sparse directory entries. This allows us to continue to use the differences in index formats 2, 3, and 4 before we create a new index version 5 in a later change. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-30 12:57:46 -07:00
Derrick Stolee	836e25c51b	sparse-checkout: hold pattern list in index As we modify the sparse-checkout definition, we perform index operations on a pattern_list that only exists in-memory. This allows easy backing out in case the index update fails. However, if the index write itself cares about the sparse-checkout pattern set, we need access to that in-memory copy. Place a pointer to a 'struct pattern_list' in the index so we can access this on-demand. This will be used in the next change which uses the sparse-checkout definition to filter out directories that are outside the sparse cone. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-30 12:57:46 -07:00
Derrick Stolee	6863df3550	unpack-trees: ensure full index The next change will translate full indexes into sparse indexes at write time. The existing logic provides a way for every sparse index to be expanded to a full index at read time. However, there are cases where an index is written and then continues to be used in-memory to perform further updates. unpack_trees() is frequently called after such a write. In particular, commands like 'git reset' do this double-update of the index. Ensure that we have a full index when entering unpack_trees(), but only when command_requires_full_index is true. This is always true at the moment, but we will later relax that after unpack_trees() is updated to handle sparse directory entries. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-30 12:57:46 -07:00
Derrick Stolee	2782db3eed	test-tool: don't force full index We will use 'test-tool read-cache --table' to check that a sparse index is written as part of init_repos. Since we will no longer always expand a sparse index into a full index, add an '--expand' parameter that adds a call to ensure_full_index() so we can compare a sparse index directly against a full index, or at least what the in-memory index looks like when expanded in this way. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-30 12:57:46 -07:00
Derrick Stolee	e2df6c3972	test-read-cache: print cache entries with --table This table is helpful for discovering data in the index to ensure it is being written correctly, especially as we build and test the sparse-index. This table includes an output format similar to 'git ls-tree', but should not be compared to that directly. The biggest reasons are that 'git ls-tree' includes a tree entry for every subdirectory, even those that would not appear as a sparse directory in a sparse-index. Further, 'git ls-tree' does not use a trailing directory separator for its tree rows. This does not print the stat() information for the blobs. That will be added in a future change with another option. The tests that are added in the next few changes care only about the object types and IDs. However, this future need for full index information justifies the need for this test helper over extending a user-facing feature, such as 'git ls-files'. To make the option parsing slightly more robust, wrap the string comparisons in a loop adapted from test-dir-iterator.c. Care must be taken with the final check for the 'cnt' variable. We continue the expectation that the numerical value is the final argument. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-30 12:57:46 -07:00
Derrick Stolee	ecfc47c066	t1092: compare sparse-checkout to sparse-index Add a new 'sparse-index' repo alongside the 'full-checkout' and 'sparse-checkout' repos in t1092-sparse-checkout-compatibility.sh. Also add run_on_sparse and test_sparse_match helpers. These helpers will be used when the sparse index is implemented. Add the GIT_TEST_SPARSE_INDEX environment variable to enable the sparse-index by default. This can be enabled across all tests, but that will only affect cases where the sparse-checkout feature is enabled. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-30 12:57:45 -07:00
Derrick Stolee	4300f8442a	sparse-index: implement ensure_full_index() We will mark an in-memory index_state as having sparse directory entries with the sparse_index bit. These currently cannot exist, but we will add a mechanism for collapsing a full index to a sparse one in a later change. That will happen at write time, so we must first allow parsing the format before writing it. Commands or methods that require a full index in order to operate can call ensure_full_index() to expand that index in-memory. This requires parsing trees using that index's repository. Sparse directory entries have a specific 'ce_mode' value. The macro S_ISSPARSEDIR(ce->ce_mode) can check if a cache_entry 'ce' has this type. This ce_mode is not possible with the existing index formats, so we don't also verify all properties of a sparse-directory entry, which are: 1. ce->ce_mode == 0040000 2. ce->flags & CE_SKIP_WORKTREE is true 3. ce->name[ce->namelen - 1] == '/' (ends in dir separator) 4. ce->oid references a tree object. These are all semi-enforced in ensure_full_index() to some extent. Any deviation will cause a warning at minimum or a failure in the worst case. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-30 12:57:45 -07:00
Derrick Stolee	3964fc2aae	sparse-index: add guard to ensure full index Upcoming changes will introduce modifications to the index format that allow sparse directories. It will be useful to have a mechanism for converting those sparse index files into full indexes by walking the tree at those sparse directories. Name this method ensure_full_index() as it will guarantee that the index is fully expanded. This method is not implemented yet, and instead we focus on the scaffolding to declare it and call it at the appropriate time. Add a 'command_requires_full_index' member to struct repo_settings. This will be an indicator that we need the index in full mode to do certain index operations. This starts as being true for every command, then we will set it to false as some commands integrate with sparse indexes. If 'command_requires_full_index' is true, then we will immediately expand a sparse index to a full one upon reading from disk. This suffices for now, but we will want to add more callers to ensure_full_index() later. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-30 12:57:45 -07:00
Derrick Stolee	4b3f765a2f	t1092: clean up script quoting This test was introduced in `19a0acc83e` (t1092: test interesting sparse-checkout scenarios, 2021-01-23), but it contains issues with quoting that were not noticed until starting this follow-up series. The old mechanism would drop quoting such as in test_all_match git commit -m "touch README.md" The above happened to work because README.md is a file in the repository, so 'git commit -m touch REAMDE.md' would succeed by accident. Other cases included quoting for no good reason, so clean that up now. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-30 12:57:45 -07:00
Derrick Stolee	0b5fcb08b5	t/perf: add performance test for sparse operations Create a test script that takes the default performance test (the Git codebase) and multiplies it by 256 using four layers of duplicated trees of width four. This results in nearly one million blob entries in the index. Then, we can clone this repository with sparse-checkout patterns that demonstrate four copies of the initial repository. Each clone will use a different index format or mode so peformance can be tested across the different options. Note that the initial repo is stripped of submodules before doing the copies. This preserves the expected data shape of the sparse index, because directories containing submodules are not collapsed to a sparse directory entry. Run a few Git commands on these clones, especially those that use the index (status, add, commit). Here are the results on my Linux machine: Test -------------------------------------------------------------- 2000.2: git status (full-index-v3) 0.37(0.30+0.09) 2000.3: git status (full-index-v4) 0.39(0.32+0.10) 2000.4: git add -A (full-index-v3) 1.42(1.06+0.20) 2000.5: git add -A (full-index-v4) 1.26(0.98+0.16) 2000.6: git add . (full-index-v3) 1.40(1.04+0.18) 2000.7: git add . (full-index-v4) 1.26(0.98+0.17) 2000.8: git commit -a -m A (full-index-v3) 1.42(1.11+0.16) 2000.9: git commit -a -m A (full-index-v4) 1.33(1.08+0.16) It is perhaps noteworthy that there is an improvement when using index version 4. This is because the v3 index uses 108 MiB while the v4 index uses 80 MiB. Since the repeated portions of the directories are very short (f3/f1/f2, for example) this ratio is less pronounced than in similarly-sized real repositories. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-30 12:57:44 -07:00
Derrick Stolee	0ad6090bdd	sparse-index: design doc and format update This begins a long effort to update the index format to allow sparse directory entries. This should result in a significant improvement to Git commands when HEAD contains millions of files, but the user has selected many fewer files to keep in their sparse-checkout definition. Currently, the index format is only updated in the presence of extensions.sparseIndex instead of increasing a file format version number. This is temporary, and index v5 is part of the plan for future work in this area. The design document details many of the reasons for embarking on this work, and also the plan for completing it safely. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-30 12:57:44 -07:00
Taylor Blau	86d174b724	t/helper/test-read-midx.c: add '--show-objects' The 'read-midx' helper is used in places like t5319 to display basic information about a multi-pack-index. In the next patch, the MIDX writing machinery will learn a new way to choose from which pack an object is selected when multiple copies of that object exist. To disambiguate which pack introduces an object so that this feature can be tested, add a '--show-objects' option which displays additional information about each object in the MIDX. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-30 12:16:56 -07:00
Taylor Blau	cd57bc41bb	builtin/multi-pack-index.c: display usage on unrecognized command When given a sub-command that it doesn't understand, 'git multi-pack-index' dies with the following message: $ git multi-pack-index bogus fatal: unrecognized subcommand: bogus Instead of 'die()'-ing, we can display the usage text, which is much more helpful: $ git.compile multi-pack-index bogus error: unrecognized subcommand: bogus usage: git multi-pack-index [<options>] write or: git multi-pack-index [<options>] verify or: git multi-pack-index [<options>] expire or: git multi-pack-index [<options>] repack [--batch-size=<size>] --object-dir <file> object directory containing set of packfile and pack-index pairs --progress force progress reporting While we're at it, clean up some duplication between the "no sub-command" and "unrecognized sub-command" conditionals. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-30 12:16:56 -07:00
Taylor Blau	690eb05719	builtin/multi-pack-index.c: don't enter bogus cmd_mode Even before the recent refactoring, 'git multi-pack-index' calls 'trace2_cmd_mode()' before verifying that the sub-command is recognized. Push this call down into the individual sub-commands so that we don't enter a bogus command mode. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-30 12:16:56 -07:00
Taylor Blau	60ca94769c	builtin/multi-pack-index.c: split sub-commands Handle sub-commands of the 'git multi-pack-index' builtin (e.g., "write", "repack", etc.) separately from one another. This allows sub-commands with unique options, without forcing cmd_multi_pack_index() to reject invalid combinations itself. This comes at the cost of some duplication and boilerplate. Luckily, the duplication is reduced to a minimum, since common options are shared among sub-commands due to a suggestion by Ævar. (Sub-commands do have to retain the common options, too, since this builtin accepts common options on either side of the sub-command). Roughly speaking, cmd_multi_pack_index() parses options (including common ones), and stops at the first non-option, which is the sub-command. It then dispatches to the appropriate sub-command, which parses the remaining options (also including common options). Unknown options are kept by the sub-commands in order to detect their presence (and complain that too many arguments were given). Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-30 12:16:56 -07:00
Taylor Blau	b25b727494	builtin/multi-pack-index.c: define common usage with a macro Factor out the usage message into pieces corresponding to each mode. This avoids options specific to one sub-command from being shared with another in the usage. A subsequent commit will use these #define macros to have usage variables for each sub-command without duplicating their contents. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-30 12:16:56 -07:00
Taylor Blau	cf1f5389ec	builtin/multi-pack-index.c: don't handle 'progress' separately Now that there is a shared 'flags' member in the options structure, there is no need to keep track of whether to force progress or not, since ultimately the decision of whether or not to show a progress meter is controlled by a bit in the flags member. Manipulate that bit directly, and drop the now-unnecessary 'progress' field while we're at it. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-30 12:16:56 -07:00
Taylor Blau	f7c4d63e35	builtin/multi-pack-index.c: inline 'flags' with options Subcommands of the 'git multi-pack-index' command (e.g., 'write', 'verify', etc.) will want to optionally change a set of shared flags that are eventually passed to the MIDX libraries. Right now, options and flags are handled separately. That's fine, since the options structure is never passed around. But a future patch will make it so that common options shared by all sub-commands are defined in a common location. That means that "flags" would have to become a global variable. Group it with the options structure so that we reduce the number of global variables we have overall. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-30 12:16:56 -07:00
Chinmoy Chakraborty	5ee90326dc	column, range-diff: downcase option description It is customary not to begin the help text for each option given to the parse-options API with a capital letter. Various (sub)commands' option arrays don't follow the guideline provided by the parse_options Documentation regarding the descriptions. Downcase the first word of some option descriptions for "column" and "range-diff". Signed-off-by: Chinmoy Chakraborty <chinmoy12c@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-29 14:06:08 -07:00
Dennis Ameling	958a5f5dfe	cmake(install): include vcpkg dlls Our CMake configuration generates not only build definitions, but also install definitions: After building Git using `msbuild git.sln`, the built artifacts can be installed via `msbuild INSTALL.vcxproj`. To specify _where_ the files should be installed, the `-DCMAKE_INSTALL_PREFIX=<path>` option can be used when running CMake. However, this process would really only install the files that were just built. On Windows, we need more than that: We also need the `.dll` files of the dependencies (such as libcurl). The `vcpkg` ecosystem, which we use to obtain those dependencies, can be asked to install said `.dll` files really easily, so let's do that. This requires more than just the built `vcpkg` artifacts in the CI build definition; We now clone the `vcpkg` repository so that the relevant CMake scripts are available, in particular the ones related to defining the toolchain. Signed-off-by: Dennis Ameling <dennis@dennisameling.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-29 13:49:04 -07:00
Johannes Schindelin	e8772a7af5	cmake: add a preparatory work-around to accommodate `vcpkg` We are about to add support for installing the `.dll` files of Git's dependencies (such as libcurl) in the CMake configuration. The `vcpkg` ecosystem from which we get said dependencies makes that relatively easy: simply turn on `X_VCPKG_APPLOCAL_DEPS_INSTALL`. However, current `vcpkg` introduces a limitation if one does that: While it is totally cool with CMake to specify multiple targets within one invocation of `install(TARGETS ...) (at least according to https://cmake.org/cmake/help/latest/command/install.html#command:install), `vcpkg`'s parser insists on a single target per `install(TARGETS ...)` invocation. Well, that's easily accomplished: Let's feed the targets individually to the `install(TARGETS ...)` function in a `foreach()` look. This also has the advantage that we do not have to manually cull off the two entries from the `${PROGRAMS_BUILT}` array before scheduling the remainder to be installed into `libexec/git-core`. Instead, we iterate through the array and decide for each entry where it wants to go. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-29 13:49:04 -07:00
Ævar Arnfjörð Bjarmason	3745e2693d	fetch-pack: use new fsck API to printing dangling submodules Refactor the check added in `5476e1efde` (fetch-pack: print and use dangling .gitmodules, 2021-02-22) to make use of us now passing the "msg_id" to the user defined "error_func". We can now compare against the FSCK_MSG_GITMODULES_MISSING instead of parsing the generated message. Let's also replace register_found_gitmodules() with directly manipulating the "gitmodules_found" member. A recent commit moved it into "fsck_options" so we could do this here. I'm sticking this callback in fsck.c. Perhaps in the future we'd like to accumulate such callbacks into another file (maybe fsck-cb.c, similar to parse-options-cb.c?), but while we've got just the one let's just put it into fsck.c. A better alternative in this case would be some library some more obvious library shared by fetch-pack.c ad builtin/index-pack.c, but there isn't such a thing. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-28 19:03:10 -07:00
Ævar Arnfjörð Bjarmason	c96e184cae	fetch-pack: use file-scope static struct for fsck_options Change code added in `5476e1efde` (fetch-pack: print and use dangling .gitmodules, 2021-02-22) so that we use a file-scoped "static struct fsck_options" instead of defining one in the "fsck_gitmodules_oids()" function. We use this pattern in all of builtin/{fsck,index-pack,mktag,unpack-objects}.c. It's odd to see fetch-pack be the odd one out. One might think that we're using other fsck_options structs in fetch-pack, or doing on fsck twice there, but we're not. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-28 19:03:10 -07:00
Ævar Arnfjörð Bjarmason	462f5cae0f	fetch-pack: don't needlessly copy fsck_options Change the behavior of the .gitmodules validation added in `5476e1efde` (fetch-pack: print and use dangling .gitmodules, 2021-02-22) so we're using one "fsck_options". I found that code confusing to read. One might think that not setting up the error_func earlier means that we're relying on the "error_func" not being set in some code in between the two hunks being modified here. But we're not, all we're doing in the rest of "cmd_index_pack()" is further setup by calling fsck_set_msg_types(), and assigning to do_fsck_object. So there was no reason in `5476e1efde` to make a shallow copy of the fsck_options struct before setting error_func. Let's just do this setup at the top of the function, along with the "walk" assignment. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-28 19:03:10 -07:00
Ævar Arnfjörð Bjarmason	c15087d17b	fsck.c: move gitmodules_{found,done} into fsck_options Move the gitmodules_{found,done} static variables added in `159e7b080b` (fsck: detect gitmodules files, 2018-05-02) into the fsck_options struct. It makes sense to keep all the context in the same place. This requires changing the recently added register_found_gitmodules() function added in `5476e1efde` (fetch-pack: print and use dangling .gitmodules, 2021-02-22) to take fsck_options. That function will be removed in a subsequent commit, but as it'll require the new gitmodules_found attribute of "fsck_options" we need this intermediate step first. An earlier version of this patch removed the small amount of duplication we now have between FSCK_OPTIONS_{DEFAULT,STRICT} with a FSCK_OPTIONS_COMMON macro. I don't think such de-duplication is worth it for this amount of copy/pasting. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-28 19:03:10 -07:00
Ævar Arnfjörð Bjarmason	53692df2b8	fsck.c: add an fsck_set_msg_type() API that takes enums Change code I added in `acf9de4c94` (mktag: use fsck instead of custom verify_tag(), 2021-01-05) to make use of a new API function that takes the fsck_msg_{id,type} types, instead of arbitrary strings that we'll (hopefully) parse into those types. At the time that the fsck_set_msg_type() API was introduced in `0282f4dced` (fsck: offer a function to demote fsck errors to warnings, 2015-06-22) it was only intended to be used to parse user-supplied data. For things that are purely internal to the C code it makes sense to have the compiler check these arguments, and to skip the sanity checking of the data in fsck_set_msg_type() which is redundant to checks we get from the compiler. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-28 19:03:10 -07:00
Ævar Arnfjörð Bjarmason	394d5d31b0	fsck.c: pass along the fsck_msg_id in the fsck_error callback Change the fsck_error callback to also pass along the fsck_msg_id. Before this change the only way to get the message id was to parse it back out of the "message". Let's pass it down explicitly for the benefit of callers that might want to use it, as discussed in [1]. Passing the msg_type is now redundant, as you can always get it back from the msg_id, but I'm not changing that convention. It's really common to need the msg_type, and the report() function itself (which calls "fsck_error") needs to call fsck_msg_type() to discover it. Let's not needlessly re-do that work in the user callback. 1. https://lore.kernel.org/git/87blcja2ha.fsf@evledraar.gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-28 19:03:10 -07:00
Ævar Arnfjörð Bjarmason	44e07da8bb	fsck.[ch]: move FOREACH_FSCK_MSG_ID & fsck_msg_id from .c to .h Move the FOREACH_FSCK_MSG_ID macro and the fsck_msg_id enum it helps define from fsck.c to fsck.h. This is in preparation for having non-static functions take the fsck_msg_id as an argument. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-28 19:03:10 -07:00
Ævar Arnfjörð Bjarmason	901f2f6742	fsck.c: give "FOREACH_MSG_ID" a more specific name Rename the FOREACH_MSG_ID macro to FOREACH_FSCK_MSG_ID in preparation for moving it over to fsck.h. It's good convention to name macros in *.h files in such a way as to clearly not clash with any other names in other files. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-28 19:03:10 -07:00
Ævar Arnfjörð Bjarmason	b5495024ec	fsck.c: undefine temporary STR macro after use In `f417eed8cd` (fsck: provide a function to parse fsck message IDs, 2015-06-22) the "STR" macro was introduced, but that short macro name was not undefined after use as was done earlier in the same series for the MSG_ID macro in `c99ba492f1` (fsck: introduce identifiers for fsck messages, 2015-06-22). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-28 19:03:10 -07:00
Ævar Arnfjörð Bjarmason	c72da1a22b	fsck.c: call parse_msg_type() early in fsck_set_msg_type() There's no reason to defer the calling of parse_msg_type() until after we've checked if the "id < 0". This is not a hot codepath, and parse_msg_type() itself may die on invalid input. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-28 19:03:10 -07:00
Ævar Arnfjörð Bjarmason	30cf618eef	fsck.h: re-order and re-assign "enum fsck_msg_type" Change the values in the "enum fsck_msg_type" from being manually assigned to using default C enum values. This means we end up with a FSCK_IGNORE=0, which was previously defined as "2". I'm confident that nothing relies on these values, we always compare them for equality. Let's not omit "0" so it won't be assumed that we're using these as a boolean somewhere. This also allows us to re-structure the fields to mark which are "private" v.s. "public". See the preceding commit for a rationale for not simply splitting these into two enums, namely that this is used for both the private and public fsck API. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-28 19:03:10 -07:00
Ævar Arnfjörð Bjarmason	1b32b59f9b	fsck.h: move FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} into an enum Move the FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} defines into a new fsck_msg_type enum. These defines were originally introduced in: - `ba002f3b28` (builtin-fsck: move common object checking code to fsck.c, 2008-02-25) - `f50c440730` (fsck: disallow demoting grave fsck errors to warnings, 2015-06-22) - `efaba7cc77` (fsck: optionally ignore specific fsck issues completely, 2015-06-22) - `f27d05b170` (fsck: allow upgrading fsck warnings to errors, 2015-06-22) The reason these were defined in two different places is because we use FSCK_{IGNORE,INFO,FATAL} only in fsck.c, but FSCK_{ERROR,WARN} are used by external callbacks. Untangling that would take some more work, since we expose the new "enum fsck_msg_type" to both. Similar to "enum object_type" it's not worth structuring the API in such a way that only those who need FSCK_{ERROR,WARN} pass around a different type. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-28 19:03:10 -07:00
Ævar Arnfjörð Bjarmason	e35d65a78a	fsck.c: refactor fsck_msg_type() to limit scope of "int msg_type" Refactor "if options->msg_type" and other code added in `0282f4dced` (fsck: offer a function to demote fsck errors to warnings, 2015-06-22) to reduce the scope of the "int msg_type" variable. This is in preparation for changing its type in a subsequent commit, only using it in the "!options->msg_type" scope makes that change This also brings the code in line with the fsck_set_msg_type() function (also added in `0282f4dced`), which does a similar check for "!options->msg_type". Another minor benefit is getting rid of the style violation of not having braces for the body of the "if". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-28 19:03:10 -07:00
Ævar Arnfjörð Bjarmason	35af754b06	fsck.c: rename remaining fsck_msg_id "id" to "msg_id" Rename the remaining variables of type fsck_msg_id from "id" to "msg_id". This change is relatively small, and is worth the churn for a later change where we have different id's in the "report" function. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-28 19:03:10 -07:00
Ævar Arnfjörð Bjarmason	034a7b7bcc	fsck.c: remove (mostly) redundant append_msg_id() function Remove the append_msg_id() function in favor of calling prepare_msg_ids(). We already have code to compute the camel-cased msg_id strings in msg_id_info, let's use it. When the append_msg_id() function was added in `71ab8fa840` (fsck: report the ID of the error/warning, 2015-06-22) the prepare_msg_ids() function didn't exist. When prepare_msg_ids() was added in `a46baac61e` (fsck: factor out msg_id_info[] lazy initialization code, 2018-05-26) this code wasn't moved over to lazy initialization. This changes the behavior of the code to initialize all the messages instead of just camel-casing the one we need on the fly. Since the common case is that we're printing just one message this is mostly redundant work. But that's OK in this case, reporting this fsck issue to the user isn't performance-sensitive. If we were somehow doing so in a tight loop (in a hopelessly broken repository?) this would help, since we'd save ourselves from re-doing this work for identical messages, we could just grab the prepared string from msg_id_info after the first invocation. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-28 19:03:10 -07:00
Ævar Arnfjörð Bjarmason	f1abc2d0e1	fsck.c: rename variables in fsck_set_msg_type() for less confusion Rename variables in a function added in `0282f4dced` (fsck: offer a function to demote fsck errors to warnings, 2015-06-22). It was needlessly confusing that it took a "msg_type" argument, but then later declared another "msg_type" of a different type. Let's rename that to "severity", and rename "id" to "msg_id" and "msg_id" to "msg_id_str" etc. This will make a follow-up change smaller. While I'm at it properly indent the fsck_set_msg_type() argument list. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-28 19:03:10 -07:00
Ævar Arnfjörð Bjarmason	a1aad71601	fsck.h: use "enum object_type" instead of "int" Change the fsck_walk_func to use an "enum object_type" instead of an "int" type. The types are compatible, and ever since this was added in `355885d531` (add generic, type aware object chain walker, 2008-02-25) we've used entries from object_type (OBJ_BLOB etc.). So this doesn't really change anything as far as the generated code is concerned, it just gives the compiler more information and makes this easier to read. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-28 19:03:10 -07:00
Ævar Arnfjörð Bjarmason	d385784f89	fsck.h: use designed initializers for FSCK_OPTIONS_{DEFAULT,STRICT} Refactor the definitions of FSCK_OPTIONS_{DEFAULT,STRICT} to use designated initializers. This allows us to omit those fields that are initialized to 0 or NULL. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-28 19:02:59 -07:00
Dennis Ameling	569f8d188f	cmake(install): fix double .exe suffixes By mistake, the `.exe` extension is appended _twice_ when installing the dashed executables into `libexec/git-core/` on Windows (the extension is already appended when adding items to the `git_links` list in the `#Creating hardlinks` section). Signed-off-by: Dennis Ameling <dennis@dennisameling.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-27 18:02:23 -07:00
Johannes Schindelin	7bb544a4d1	cmake: support SKIP_DASHED_BUILT_INS Just like the Makefile-based build learned to skip hard-linking the dashed built-ins in `179227d6e2` (Optionally skip linking/copying the built-ins, 2020-09-21), this patch teaches the CMake-based build the same trick. Note: In contrast to the Makefile-based process, the built-ins would only be linked during installation, not already when Git is built. Therefore, the CMake-based build that we use in our CI builds _already_ does not link those built-ins (because the files are not installed anywhere, they are used to run the test suite in-place). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-27 18:02:23 -07:00
Johannes Schindelin	09420b7648	Document how we do embargoed releases Whenever we fix critical vulnerabilities, we follow some sort of protocol (e.g. setting a coordinated release date, keeping the fix under embargo until that time, coordinating with packagers and/or hosting sites, etc). Similar in spirit to `Documentation/howto/maintain-git.txt`, let's formalize the details in a document. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-27 15:13:12 -07:00
Johannes Schindelin	2e99b1e383	SECURITY: describe how to report vulnerabilities In the same document, describe that Git does not have Long Term Support (LTS) release trains, although security fixes are always applied to a few of the most recent release trains. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-27 15:13:02 -07:00
René Scharfe	9a7f1ce8b7	daemon: sanitize all directory separators When sanitizing client-supplied strings on Windows, also strip off backslashes, not just slashes. Signed-off-by: René Scharfe <l.s.r@web.de> Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-26 22:00:12 -07:00
Junio C Hamano	84d06cdc06	Sync with v2.31.1	2021-03-26 14:59:47 -07:00
Junio C Hamano	26c4f98ffd	The fourth batch Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-26 14:59:03 -07:00
Junio C Hamano	89519f662c	Merge branch 'cm/rebase-i-fixup-amend-reword' "git commit --fixup=<commit>", which was to tweak the changes made to the contents while keeping the original log message intact, learned "--fixup=(amend\|reword):<commit>", that can be used to tweak both the message and the contents, and only the message, respectively. * cm/rebase-i-fixup-amend-reword: doc/git-commit: add documentation for fixup=[amend\|reword] options t3437: use --fixup with options to create amend! commit t7500: add tests for --fixup=[amend\|reword] options commit: add a reword suboption to --fixup commit: add amend suboption to --fixup to create amend! commit sequencer: export and rename subject_length()	2021-03-26 14:59:03 -07:00
Junio C Hamano	fde07fc356	Merge branch 'cm/rebase-i-updates' Follow-up fixes to "cm/rebase-i" topic. * cm/rebase-i-updates: doc/rebase -i: fix typo in the documentation of 'fixup' command t/t3437: fixup the test 'multiple fixup -c opens editor once' t/t3437: use named commits in the tests t/t3437: simplify and document the test helpers t/t3437: check the author date of fixed up commit t/t3437: remove the dependency of 'expected-message' file from tests t/t3437: fixup here-docs in the 'setup' test t/lib-rebase: update the documentation of FAKE_LINES rebase -i: clarify and fix 'fixup -c' rebase-todo help sequencer: rename a few functions sequencer: fixup the datatype of the 'flag' argument	2021-03-26 14:59:03 -07:00
Junio C Hamano	ce4296cf2b	Merge branch 'cm/rebase-i' "rebase -i" is getting cleaned up and also enhanced. * cm/rebase-i: doc/git-rebase: add documentation for fixup [-C\|-c] options rebase -i: teach --autosquash to work with amend! t3437: test script for fixup [-C\|-c] options in interactive rebase rebase -i: add fixup [-C \| -c] command sequencer: use const variable for commit message comments sequencer: pass todo_item to do_pick_commit() rebase -i: comment out squash!/fixup! subjects from squash message sequencer: factor out code to append squash message rebase -i: only write fixup-message when it's needed	2021-03-26 14:59:03 -07:00
Junio C Hamano	8c81fce4b0	Merge branch 'js/http-pki-credential-store' The http codepath learned to let the credential layer to cache the password used to unlock a certificate that has successfully been used. * js/http-pki-credential-store: http: drop the check for an empty proxy password before approving http: store credential when PKI auth is used	2021-03-26 14:59:02 -07:00
Junio C Hamano	ed953e1076	Merge branch 'ab/make-cleanup' Reorganize Makefile to allow building git.o and other essential objects without extra stuff needed only for testing. * ab/make-cleanup: Makefile: add {program,xdiff,test,git,fuzz}-objs & objects targets Makefile: split OBJECTS into OBJECTS and GIT_OBJS Makefile: sort OBJECTS assignment for subsequent change Makefile: split up long OBJECTS line Makefile: guard against TEST_OBJS in the environment	2021-03-26 14:59:02 -07:00
Junio C Hamano	48bf2fa8ba	Git 2.31.1 Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-26 14:49:41 -07:00
Derrick Stolee	ddaf1f62e3	csum-file: make hashwrite() more readable The hashwrite() method takes an input buffer and updates a hashfile's hash function while writing the data to a file. To avoid overuse of flushes, the hashfile has an internal buffer and most writes will use memcpy() to transfer data from the input 'buf' to the hashfile's buffer of size 8 * 1024 bytes. Logic introduced by `a8032d12` (sha1write: don't copy full sized buffers, 2008-09-02) reduces the number of memcpy() calls when the input buffer is sufficiently longer than the hashfile's buffer, causing nr to be the length of the full buffer. In these cases, the input buffer is used directly in chunks equal to the hashfile's buffer size. This method caught my attention while investigating some performance issues, but it turns out that these performance issues were noise within the variance of the experiment. However, during this investigation, I inspected hashwrite() and misunderstood it, even after looking closely and trying to make it faster. This change simply reorganizes some parts of the loop within hashwrite() to make it clear that each batch either uses memcpy() to the hashfile's buffer or writes directly from the input buffer. The previous code relied on indirection through local variables and essentially inlined the implementation of hashflush() to reduce lines of code. Helped-by: Jeff King <peff@peff.net> Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-26 14:32:45 -07:00
Junio C Hamano	9198c13e34	The third patch Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-24 14:36:27 -07:00
Junio C Hamano	858119f6d7	Merge branch 'nk/diff-index-fsmonitor' "git diff-index" codepath has been taught to trust fsmonitor status to reduce number of lstat() calls. * nk/diff-index-fsmonitor: fsmonitor: add perf test for git diff HEAD fsmonitor: add assertion that fsmonitor is valid to check_removed fsmonitor: skip lstat deletion check during git diff-index	2021-03-24 14:36:27 -07:00
Junio C Hamano	e537784f64	Merge branch 'jk/fail-prereq-testfix' GIT_TEST_FAIL_PREREQS is a mechanism to skip test pieces with prerequisites to catch broken tests that depend on the side effects of optional pieces, but did not work at all when negative prerequisites were involved. * jk/fail-prereq-testfix: t: annotate !PTHREADS tests with !FAIL_PREREQS	2021-03-24 14:36:27 -07:00
Junio C Hamano	2744383cbd	Merge branch 'tb/geometric-repack' "git repack" so far has been only capable of repacking everything under the sun into a single pack (or split by size). A cleverer strategy to reduce the cost of repacking a repository has been introduced. * tb/geometric-repack: builtin/pack-objects.c: ignore missing links with --stdin-packs builtin/repack.c: reword comment around pack-objects flags builtin/repack.c: be more conservative with unsigned overflows builtin/repack.c: assign pack split later t7703: test --geometric repack with loose objects builtin/repack.c: do not repack single packs with --geometric builtin/repack.c: add '--geometric' option packfile: add kept-pack cache for find_kept_pack_entry() builtin/pack-objects.c: rewrite honor-pack-keep logic p5303: measure time to repack with keep p5303: add missing &&-chains builtin/pack-objects.c: add '--stdin-packs' option revision: learn '--no-kept-objects' packfile: introduce 'find_kept_pack_entry()'	2021-03-24 14:36:27 -07:00
Junio C Hamano	c6617d1e4f	Merge branch 'tb/push-simple-uses-branch-merge-config' Doc update. * tb/push-simple-uses-branch-merge-config: Documentation/git-push.txt: correct configuration typo	2021-03-24 14:36:27 -07:00
Han Xin	bf12013f1a	pack-objects: fix comment of reused_chunk.difference As record_reused_object(offset, offset - hashfile_total(out)) said, reused_chunk.difference should be the offset of original packfile minus the offset of the generated packfile. But the comment presented an opposite way. Signed-off-by: Han Xin <hanxin.hx@alibaba-inc.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-24 13:03:22 -07:00
Junio C Hamano	28e29ee38b	format-patch: give an overview of what a "patch" message is The text says something called a "patch" is prepared one for each commit, it is suitable for e-mail submission, and "am" is the command to use it, but does not say what the "patch" really is. The description in the page also refers to the "three-dash" line, but it is unclear what it is, unless the reader is given a more detailed overview of what the "patch" is. Add a brief paragraph to give an overview of what the output looks like. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-24 12:14:23 -07:00
Denton Liu	6131807864	git-completion.bash: use __gitcomp_builtin() in _git_stash() The completion for 'git stash' has not changed in a major way since it was converted from shell script to builtin. Now that it's a builtin, we can take advantage of the groundwork laid out by parse-options and use the generated options. Rewrite _git_stash() to take use __gitcomp_builtin() to generate completions for subcommands. The main `git stash` command does not take any arguments directly. If no subcommand is given, it automatically defaults to `git stash push`. This means that we can simplify the logic for when no subcommands have been given yet. We only have to offer subcommand completions when we're completing a non-option after "stash". One area that this patch could improve upon is that the `git stash list` command accepts log-options. It would be nice if the completion for this were unified with that of _git_log() and _git_show() which would allow completions to be provided for options such as `--pretty` but that is outside the scope of this patch. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-24 10:05:47 -07:00
Denton Liu	42b30bcbb7	git-completion.bash: extract from else in _git_stash() To save a level of indentation, perform an early return in the "if" arm so we can move the "else" code out of the block. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-24 10:05:47 -07:00
Denton Liu	e94fb44042	git-completion.bash: pass $__git_subcommand_idx from __git_main() Many completion functions perform hardcoded comparisons with $cword. This fails in the case where the main git command is given arguments (e.g. `git -C . bundle<TAB>` would fail to complete its subcommands). Even _git_worktree(), which uses __git_find_on_cmdline(), could still fail. With something like `git -C add worktree move<TAB>`, the subcommand would be incorrectly identified as "add" instead of "move". Assign $__git_subcommand_idx in __git_main(), where the git subcommand is actually found and the corresponding completion function is called. Use this variable to replace hardcoded comparisons with $cword. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-24 10:05:47 -07:00
Ævar Arnfjörð Bjarmason	76593c09bb	mktag tests: fix broken "&&" chain Remove a stray "xb" I inadvertently introduced in `780aa0a21e` (tests: remove last uses of GIT_TEST_GETTEXT_POISON=false, 2021-02-11). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-23 22:14:28 -07:00
Robert Foss	c8243933c7	git-send-email: Respect core.hooksPath setting get-send-email currently makes the assumption that the 'sendemail-validate' hook exists inside of the repository. Since the introduction of 'core.hooksPath' configuration option in `867ad08a26` (hooks: allow customizing where the hook directory is, 2016-05-04), this is no longer true. Instead of assuming a hardcoded repo relative path, query git for the actual path of the hooks directory. Signed-off-by: Robert Foss <robert.foss@linaro.org> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-23 15:02:52 -07:00
Ævar Arnfjörð Bjarmason	9bcde4d531	rebase: remove transitory rebase.useBuiltin setting & env Remove the rebase.useBuiltin setting and the now-obsolete GIT_TEST_REBASE_USE_BUILTIN test flag. This was left in place after my `d03ebd411c` (rebase: remove the rebase.useBuiltin setting, 2019-03-18) to help anyone who'd used the experimental flag and wanted to know that it was the default, or that they should transition their test environment to use the builtin rebase unconditionally. It's been more than long enough for those users to get a headsup about this. So remove all the scaffolding that was left inplace after `d03ebd411c`. I'm also removing the documentation entry, if anyone still has this left in their configuration they can do some source archaeology to figure out what it used to do, which makes more sense than exposing every git user reading the documentation to this legacy configuration switch. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-23 14:05:58 -07:00
ZheNing Hu	db91988aa1	format-patch: allow a non-integral version numbers The `-v<n>` option of `format-patch` can give nothing but an integral iteration number to patches in a series. Some people, however, prefer to mark a new iteration with only a small fixup with a non integral iteration number (e.g. an "oops, that was wrong" fix-up patch for v4 iteration may be labeled as "v4.1"). Allow `format-patch` to take such a non-integral iteration number. `<n>` can be any string, such as '3.1' or '4rev2'. In the case where it is a non-integral value, the "Range-diff" and "Interdiff" headers will not include the previous version. Signed-off-by: ZheNing Hu <adlternative@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-23 12:49:47 -07:00
Matheus Tavares	ae22751f9b	entry: add checkout_entry_ca() taking preloaded conv_attrs The parallel checkout machinery will call checkout_entry() for entries that could not be written in parallel due to path collisions. At this point, we will already be holding the conversion attributes for each entry, and it would be wasteful to let checkout_entry() load these again. Instead, let's add the checkout_entry_ca() variant, which optionally takes a preloaded conv_attrs struct. Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-23 10:34:05 -07:00
Matheus Tavares	30419e7e1d	entry: move conv_attrs lookup up to checkout_entry() In a following patch, checkout_entry() will use conv_attrs to decide whether an entry should be enqueued for parallel checkout or not. But the attributes lookup only happens lower in this call stack. To avoid the unnecessary work of loading the attributes twice, let's move it up to checkout_entry(), and pass the loaded struct down to write_entry(). Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-23 10:34:05 -07:00
Matheus Tavares	584a0d13f2	entry: extract update_ce_after_write() from write_entry() The code that updates the in-memory index information after an entry is written currently resides in write_entry(). Extract it to a public function so that it can be called by the parallel checkout functions, outside entry.c, in a later patch. Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-23 10:34:05 -07:00
Matheus Tavares	49cfd9032a	entry: make fstat_output() and read_blob_entry() public These two functions will be used by the parallel checkout code, so let's make them public. Note: fstat_output() is renamed to fstat_checkout_output(), now that it has become public, seeking to avoid future name collisions. Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-23 10:34:05 -07:00
Matheus Tavares	d052cc0382	entry: extract a header file for entry.c functions The declarations of entry.c's public functions and structures currently reside in cache.h. Although not many, they contribute to the size of cache.h and, when changed, cause the unnecessary recompilation of modules that don't really use these functions. So let's move them to a new entry.h header. While at it let's also move a comment related to checkout_entry() from entry.c to entry.h as it's more useful to describe the function there. Original-patch-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-23 10:34:05 -07:00
ZheNing Hu	2daae3d1d1	commit: add --trailer option Historically, Git has supported the 'Signed-off-by' commit trailer using the '--signoff' and the '-s' option from the command line. But users may need to provide other trailer information from the command line such as "Helped-by", "Reported-by", "Mentored-by", Now implement a new `--trailer <token>[(=\|:)<value>]` option to pass other trailers to `interpret-trailers` and insert them into commit messages. Signed-off-by: ZheNing Hu <adlternative@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-23 10:31:38 -07:00
Junio C Hamano	1424303384	The second batch Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-22 14:00:25 -07:00
Junio C Hamano	3099d4faa3	Merge branch 'bc/clone-bare-with-conflicting-config' "git -c core.bare=false clone --bare ..." would have segfaulted, which has been corrected. * bc/clone-bare-with-conflicting-config: builtin/init-db: handle bare clones when core.bare set to false	2021-03-22 14:00:25 -07:00
Junio C Hamano	d4bda9b045	Merge branch 'jk/filter-branch-sha256' Code clean-up. * jk/filter-branch-sha256: filter-branch: drop $_x40 glob filter-branch: drop multiple-ancestor warning t7003: test ref rewriting explicitly	2021-03-22 14:00:25 -07:00
Junio C Hamano	20adca9006	Merge branch 'ps/update-ref-trans-hook-doc' Doc update. * ps/update-ref-trans-hook-doc: githooks.txt: clarify documentation on reference-transaction hook githooks.txt: replace mentions of SHA-1 specific properties	2021-03-22 14:00:25 -07:00
Junio C Hamano	960f466d1a	Merge branch 'rr/mailmap-entry-self' * rr/mailmap-entry-self: Add entry for Ramkumar Ramachandra	2021-03-22 14:00:25 -07:00
Junio C Hamano	3d92c0a784	Merge branch 'jr/doc-ignore-typofix' Doc cleanup. * jr/doc-ignore-typofix: doc: .gitignore documentation typofix	2021-03-22 14:00:25 -07:00
Junio C Hamano	44e03bfdb6	Merge branch 'sv/t9801-test-path-is-file-cleanup' Test cleanup. * sv/t9801-test-path-is-file-cleanup: t9801: replace test -f with test_path_is_file	2021-03-22 14:00:24 -07:00
Junio C Hamano	c83d602ad2	Merge branch 'dl/cat-file-doc-cleanup' Doc cleanup. * dl/cat-file-doc-cleanup: git-cat-file.txt: remove references to "sha1" git-cat-file.txt: monospace args, placeholders and filenames	2021-03-22 14:00:24 -07:00
Junio C Hamano	25f9326561	Merge branch 'rs/pretty-describe' "git log --format='...'" learned "%(describe)" placeholder. * rs/pretty-describe: archive: expand only a single %(describe) per archive pretty: document multiple %(describe) being inconsistent t4205: assert %(describe) test coverage pretty: add merge and exclude options to %(describe) pretty: add %(describe)	2021-03-22 14:00:24 -07:00
Junio C Hamano	f5c73f69fd	Merge branch 'dl/stash-show-untracked' "git stash show" learned to optionally show untracked part of the stash. * dl/stash-show-untracked: stash show: learn stash.showIncludeUntracked stash show: teach --include-untracked and --only-untracked	2021-03-22 14:00:24 -07:00
Junio C Hamano	dd4048d1c7	Merge branch 'en/ort-perf-batch-8' Rename detection rework continues. * en/ort-perf-batch-8: diffcore-rename: compute dir_rename_guess from dir_rename_counts diffcore-rename: limit dir_rename_counts computation to relevant dirs diffcore-rename: compute dir_rename_counts in stages diffcore-rename: extend cleanup_dir_rename_info() diffcore-rename: move dir_rename_counts into dir_rename_info struct diffcore-rename: add function for clearing dir_rename_count Move computation of dir_rename_count from merge-ort to diffcore-rename diffcore-rename: add a mapping of destination names to their indices diffcore-rename: provide basic implementation of idx_possible_rename() diffcore-rename: use directory rename guided basename comparisons	2021-03-22 14:00:24 -07:00
Junio C Hamano	24119d9d7b	Merge branch 'ab/grep-pcre2-allocfix' Updates to memory allocation code around the use of pcre2 library. * ab/grep-pcre2-allocfix: grep/pcre2: move definitions of pcre2_{malloc,free} grep/pcre2: move back to thread-only PCREv2 structures grep/pcre2: actually make pcre2 use custom allocator grep/pcre2: use pcre2_maketables_free() function grep/pcre2: use compile-time PCREv2 version test grep/pcre2: add GREP_PCRE2_DEBUG_MALLOC debug mode grep/pcre2: prepare to add debugging to pcre2_malloc() grep/pcre2: correct reference to grep_init() in comment grep/pcre2: drop needless assignment to NULL grep/pcre2: drop needless assignment + assert() on opt->pcre2	2021-03-22 14:00:23 -07:00
Junio C Hamano	e8d5a423ca	Merge branch 'jk/perf-in-worktrees' Perf test update to work better in secondary worktrees. * jk/perf-in-worktrees: t/perf: avoid copying worktree files from test repo t/perf: handle worktrees as test repos	2021-03-22 14:00:23 -07:00
Junio C Hamano	d20fa3cf9d	Merge branch 'ds/commit-graph-generation-config' A new configuration variable has been introduced to allow choosing which version of the generation number gets used in the commit-graph file. * ds/commit-graph-generation-config: commit-graph: use config to specify generation type commit-graph: create local repository pointer	2021-03-22 14:00:23 -07:00
Junio C Hamano	52182e3b1f	Merge branch 'ab/remote-write-config-in-camel-case' Update C code that sets a few configuration variables when a remote is configured so that it spells configuration variable names in the canonical camelCase. * ab/remote-write-config-in-camel-case: remote: write camel-cased .pushRemote on rename remote: add camel-cased .tagOpt key, like clone	2021-03-22 14:00:23 -07:00
Junio C Hamano	2435feaa20	Merge branch 'mt/cleanly-die-upon-missing-required-filter' We had a code to diagnose and die cleanly when a required clean/smudge filter is missing, but an assert before that unnecessarily fired, hiding the end-user facing die() message. * mt/cleanly-die-upon-missing-required-filter: convert: fail gracefully upon missing clean cmd on required filter	2021-03-22 14:00:22 -07:00
Junio C Hamano	204333b015	Merge branch 'jk/open-dotgitx-with-nofollow' It does not make sense to make ".gitattributes", ".gitignore" and ".mailmap" symlinks, as they are supposed to be usable from the object store (think: bare repositories where HEAD:.mailmap etc. are used). When these files are symbolic links, we used to read the contents of the files pointed by them by mistake, which has been corrected. * jk/open-dotgitx-with-nofollow: mailmap: do not respect symlinks for in-tree .mailmap exclude: do not respect symlinks for in-tree .gitignore attr: do not respect symlinks for in-tree .gitattributes exclude: add flags parameter to add_patterns() attr: convert "macro_ok" into a flags field add open_nofollow() helper	2021-03-22 14:00:22 -07:00
Ævar Arnfjörð Bjarmason	2be927f3d1	diff --no-index tests: test mode normalization When "git diff --no-index X Y" is run the modes of the files being differ are normalized by canon_mode() in fill_filespec(). I recently broke that behavior in a patch of mine[1] which would pass all tests, or not, depending on the umask of the git.git checkout. Let's test for this explicitly. Arguably this should not be the behavior of "git diff --no-index". We aren't diffing our own objects or the index, so it might be useful to show mode differences between files. On the other hand diff(1) does not do that, and it would be needlessly distracting when e.g. diffing an extracted tar archive whose contents is the same, but whose file modes are different. 1. https://lore.kernel.org/git/20210316155829.31242-2-avarab@gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-22 12:22:26 -07:00
Patrick Steinhardt	540cdc11ad	pack-bitmap: avoid traversal of objects referenced by uninteresting tag When preparing the bitmap walk, we first establish the set of of have and want objects by iterating over the set of pending objects: if an object is marked as uninteresting, it's declared as an object we already have, otherwise as an object we want. These two sets are then used to compute which transitively referenced objects we need to obtain. One special case here are tag objects: when a tag is requested, we resolve it to its first not-tag object and add both resolved objects as well as the tag itself into either the have or want set. Given that the uninteresting-property always propagates to referenced objects, it is clear that if the tag is uninteresting, so are its children and vice versa. But we fail to propagate the flag, which effectively means that referenced objects will always be interesting except for the case where they have already been marked as uninteresting explicitly. This mislabeling does not impact correctness: we now have it in our "wants" set, and given that we later do an `AND NOT` of the bitmaps of "wants" and "haves" sets it is clear that the result must be the same. But we now start to needlessly traverse the tag's referenced objects in case it is uninteresting, even though we know that each referenced object will be uninteresting anyway. In the worst case, this can lead to a complete graph walk just to establish that we do not care for any object. Fix the issue by propagating the `UNINTERESTING` flag to pointees of tag objects and add a benchmark with negative revisions to p5310. This shows some nice performance benefits, tested with linux.git: Test HEAD~ HEAD --------------------------------------------------------------------------------------------------------------- 5310.3: repack to disk 193.18(181.46+16.42) 194.61(183.41+15.83) +0.7% 5310.4: simulated clone 25.93(24.88+1.05) 25.81(24.73+1.08) -0.5% 5310.5: simulated fetch 2.64(5.30+0.69) 2.59(5.16+0.65) -1.9% 5310.6: pack to file (bitmap) 58.75(57.56+6.30) 58.29(57.61+5.73) -0.8% 5310.7: rev-list (commits) 1.45(1.18+0.26) 1.46(1.22+0.24) +0.7% 5310.8: rev-list (objects) 15.35(14.22+1.13) 15.30(14.23+1.07) -0.3% 5310.9: rev-list with tag negated via --not --all (objects) 22.49(20.93+1.56) 0.11(0.09+0.01) -99.5% 5310.10: rev-list with negative tag (objects) 0.61(0.44+0.16) 0.51(0.35+0.16) -16.4% 5310.11: rev-list count with blob:none 12.15(11.19+0.96) 12.18(11.19+0.99) +0.2% 5310.12: rev-list count with blob:limit=1k 17.77(15.71+2.06) 17.75(15.63+2.12) -0.1% 5310.13: rev-list count with tree:0 1.69(1.31+0.38) 1.68(1.28+0.39) -0.6% 5310.14: simulated partial clone 20.14(19.15+0.98) 19.98(18.93+1.05) -0.8% 5310.16: clone (partial bitmap) 12.78(13.89+1.07) 12.72(13.99+1.01) -0.5% 5310.17: pack to file (partial bitmap) 42.07(45.44+2.72) 41.44(44.66+2.80) -1.5% 5310.18: rev-list with tree filter (partial bitmap) 0.44(0.29+0.15) 0.46(0.32+0.14) +4.5% While most benchmarks are probably in the range of noise, the newly added 5310.9 and 5310.10 benchmarks consistenly perform better. Signed-off-by: Patrick Steinhardt <ps@pks.im>. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-22 12:10:56 -07:00
Christopher Schenk	1b0d9545bb	remote-curl: fall back to basic auth if Negotiate fails When the username and password are supplied in a url like this https://myuser:secret@git.exampe/myrepo.git and the server supports the negotiate authenticaten method, git does not fall back to basic auth and libcurl hardly tries to authenticate with the negotiate method. Stop using the Negotiate authentication method after the first failure because if it fails on the first try it will never succeed. Signed-off-by: Christopher Schenk <christopher@cschenk.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-22 11:55:41 -07:00
Jeff Hostetler	36a7eb6876	t0052: add simple-ipc tests and t/helper/test-simple-ipc tool Create t0052-simple-ipc.sh with unit tests for the "simple-ipc" mechanism. Create t/helper/test-simple-ipc test tool to exercise the "simple-ipc" functions. When the tool is invoked with "run-daemon", it runs a server to listen for "simple-ipc" connections on a test socket or named pipe and responds to a set of commands to exercise/stress the communication setup. When the tool is invoked with "start-daemon", it spawns a "run-daemon" command in the background and waits for the server to become ready before exiting. (This helps make unit tests in t0052 more predictable and avoids the need for arbitrary sleeps in the test script.) The tool also has a series of client "send" commands to send commands and data to a server instance. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-22 11:52:54 -07:00
Jeff Hostetler	7cd5dbcaba	simple-ipc: add Unix domain socket implementation Create Unix domain socket based implementation of "simple-ipc". A set of `ipc_client` routines implement a client library to connect to an `ipc_server` over a Unix domain socket, send a simple request, and receive a single response. Clients use blocking IO on the socket. A set of `ipc_server` routines implement a thread pool to listen for and concurrently service client connections. The server creates a new Unix domain socket at a known location. If a socket already exists with that name, the server tries to determine if another server is already listening on the socket or if the socket is dead. If socket is busy, the server exits with an error rather than stealing the socket. If the socket is dead, the server creates a new one and starts up. If while running, the server detects that its socket has been stolen by another server, it automatically exits. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-22 11:52:54 -07:00
Ævar Arnfjörð Bjarmason	271cb303a5	diff --no-index tests: add test for --exit-code Add a test for --exit-code working with --no-index. There's no reason to suppose it wouldn't, but we weren't testing for it anywhere in our tests. Let's fix that blind spot. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-22 11:48:41 -07:00
Andrzej Hunt	68ffe095a2	transport: also free remote_refs in transport_disconnect() transport_get_remote_refs() can populate the transport struct's remote_refs. transport_disconnect() is already responsible for most of transport's cleanup - therefore we also take care of freeing remote_refs there. There are 2 locations where transport_disconnect() is called before we're done using the returned remote_refs. This patch changes those callsites to only call transport_disconnect() after the returned refs are no longer being used - which is necessary to safely be able to free remote_refs during transport_disconnect(). This commit fixes the following leak which was found while running t0000, but is expected to also fix the same pattern of leak in all locations that use transport_get_remote_refs(): Direct leak of 165 byte(s) in 1 object(s) allocated from: #0 0x49a6b2 in calloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:154:3 #1 0x9a72f2 in xcalloc /home/ahunt/oss-fuzz/git/wrapper.c:140:8 #2 0x8ce203 in alloc_ref_with_prefix /home/ahunt/oss-fuzz/git/remote.c:867:20 #3 0x8ce1a2 in alloc_ref /home/ahunt/oss-fuzz/git/remote.c:875:9 #4 0x72f63e in process_ref_v2 /home/ahunt/oss-fuzz/git/connect.c:426:8 #5 0x72f21a in get_remote_refs /home/ahunt/oss-fuzz/git/connect.c:525:8 #6 0x979ab7 in handshake /home/ahunt/oss-fuzz/git/transport.c:305:4 #7 0x97872d in get_refs_via_connect /home/ahunt/oss-fuzz/git/transport.c:339:9 #8 0x9774b5 in transport_get_remote_refs /home/ahunt/oss-fuzz/git/transport.c:1388:4 #9 0x51cf80 in cmd_clone /home/ahunt/oss-fuzz/git/builtin/clone.c:1271:9 #10 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11 #11 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3 #12 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4 #13 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19 #14 0x69c45e in main /home/ahunt/oss-fuzz/git/common-main.c:52:11 #15 0x7f6a459d5349 in __libc_start_main (/lib64/libc.so.6+0x24349) Signed-off-by: Andrzej Hunt <ajrhunt@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-21 14:39:10 -07:00
Andrzej Hunt	64cc539fd2	parse-options: don't leak alias help messages preprocess_options() allocates new strings for help messages for OPTION_ALIAS. Therefore we also need to clean those help messages up when freeing the returned options. First introduced in: `7c280589cf` (parse-options: teach "git cmd -h" to show alias as alias, 2020-03-16) The preprocessed options themselves no longer contain any indication that a given option is/was an alias - therefore we add a new flag to indicate former aliases. (An alternative approach would be to look back at the original options to determine which options are aliases - but that seems like a fragile approach. Or we could even look at the alias_groups list - which might be less fragile, but would be slower as it requires nested looping.) As far as I can tell, parse_options() is only ever used once per command, and the help messages are small - hence this leak has very little impact. This leak was found while running t0001. LSAN output can be found below: Direct leak of 65 byte(s) in 1 object(s) allocated from: #0 0x49a859 in realloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3 #1 0x9aae36 in xrealloc /home/ahunt/oss-fuzz/git/wrapper.c:126:8 #2 0x939d8d in strbuf_grow /home/ahunt/oss-fuzz/git/strbuf.c:98:2 #3 0x93b936 in strbuf_vaddf /home/ahunt/oss-fuzz/git/strbuf.c:392:3 #4 0x93b7ff in strbuf_addf /home/ahunt/oss-fuzz/git/strbuf.c:333:2 #5 0x86747e in preprocess_options /home/ahunt/oss-fuzz/git/parse-options.c:666:3 #6 0x866ed2 in parse_options /home/ahunt/oss-fuzz/git/parse-options.c:847:17 #7 0x51c4a7 in cmd_clone /home/ahunt/oss-fuzz/git/builtin/clone.c:989:9 #8 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11 #9 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3 #10 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4 #11 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19 #12 0x69c9fe in main /home/ahunt/oss-fuzz/git/common-main.c:52:11 #13 0x7fdac42d4349 in __libc_start_main (/lib64/libc.so.6+0x24349) Signed-off-by: Andrzej Hunt <ajrhunt@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-21 14:39:10 -07:00
Andrzej Hunt	0171dbcb42	parse-options: convert bitfield values to use binary shift Because it's easier to read, but also likely to be easier to maintain. I am making this change because I need to add a new flag in a later commit. Also add a trailing comma to the last enum entry to simplify addition of new flags. This change was originally suggested by Peff in: https://public-inbox.org/git/YEZ%2FBWWbpfVwl6nO@coredump.intra.peff.net/ Signed-off-by: Andrzej Hunt <ajrhunt@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-21 14:39:10 -07:00
Ævar Arnfjörð Bjarmason	47957485b3	tree.h API: simplify read_tree_recursive() signature Simplify the signature of read_tree_recursive() to omit the "base", "baselen" and "stage" arguments. No callers of it use these parameters for anything anymore. The last function to call read_tree_recursive() with a non-"" path was read_tree_recursive() itself, but that was changed in `ffd31f661d` (Reimplement read_tree_recursive() using tree_entry_interesting(), 2011-03-25). The last user of the "stage" parameter went away in the last commit, and even that use was mere boilerplate. So let's remove those and rename the read_tree_recursive() function to just read_tree(). We had another read_tree() function that I've refactored away in preceding commits, since all in-tree users read trees recursively with a callback we can change the name to signify that this is the norm. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-20 16:09:26 -07:00
Ævar Arnfjörð Bjarmason	6c9fc42e9f	tree.h API: expose read_tree_1() as read_tree_at() Rename the static read_tree_1() function to read_tree_at(). This function works just like read_tree_recursive(), except you provide your own strbuf. This step doesn't make much sense now, but in follow-up commits I'll remove the base/baselen/stage arguments to read_tree_recursive(). At that point an anticipated in-tree user[1] for the old read_tree_recursive() couldn't provide a path to start the traversal. Let's give them a function to do so with an API that makes more sense for them, by taking a strbuf we should be able to avoid more casting and/or reallocations in the future. 1. https://lore.kernel.org/git/xmqqft106sok.fsf@gitster.g Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-20 16:09:26 -07:00
Ævar Arnfjörð Bjarmason	7367d88261	archive: stop passing "stage" through read_tree_recursive() The "stage" variable being passed around in the archive code has only ever been an elaborate way to hardcode the value "0". This code was added in its original form in `e4fbbfe9ec` (Add git-zip-tree, 2006-08-26), at which point a hardcoded "0" would be passed down through read_tree_recursive() to write_zip_entry(). It was then diligently added to the "struct directory" in `ed22b4173b` (archive: support filtering paths with glob, 2014-09-21), but we were still not doing anything except passing it around as-is. Let's stop doing that in the code internal to archive.c, we'll still feed "0" to read_tree_recursive() itself, but won't use it. That we're providing it at all to read_tree_recursive() will be changed in a follow-up commit. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-20 16:09:26 -07:00
Ævar Arnfjörð Bjarmason	9614ad3ce0	ls-files: refactor away read_tree() Refactor away the read_tree() function into its only user, overlay_tree_on_index(). First, change read_one_entry_opt() to use the strbuf parameter read_tree_recursive() passes down in place. This finishes up a partial refactoring started in `6a0b0b6de9` (tree.c: update read_tree_recursive callback to pass strbuf as base, 2014-11-30). Moving the rest into overlay_tree_on_index() makes this index juggling we're doing easier to read. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-20 16:09:26 -07:00
Ævar Arnfjörð Bjarmason	fcc7c12f11	ls-files: don't needlessly pass around stage variable Now that read_tree() has been moved to ls-files.c we can get rid of the stage != 1 case that'll never happen. Let's not use read_tree_recursive() as a pass-through to pass "stage = 1" either. For now we'll pass an unused "stage = 0" for consistency with other read_tree_recursive() callers, that argument will be removed in a follow-up commit. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-20 16:09:26 -07:00
Ævar Arnfjörð Bjarmason	eefadd18e1	tree.c API: move read_tree() into builtin/ls-files.c Since the read_tree() API was added around the same time as read_tree_recursive() in `94537c78a8` (Move "read_tree()" to "tree.c"[...], 2005-04-22) and `b12ec373b8` ([PATCH] Teach read-tree about commit objects, 2005-04-20) things have gradually migrated over to the read_tree_recursive() version. Now builtin/ls-files.c is the last user of this code, let's move all the relevant code there. This allows for subsequent simplification of it, and an eventual move to read_tree_recursive(). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-20 16:09:25 -07:00
Ævar Arnfjörð Bjarmason	8de78218c5	ls-files tests: add meaningful --with-tree tests Add tests for "ls-files --with-tree". There was effectively no coverage for any normal usage of this command, only the tests added in `54e1abce90` (Add test case for ls-files --with-tree, 2007-10-03) for an obscure bug. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-20 16:09:25 -07:00
Ævar Arnfjörð Bjarmason	dcc0a86f2f	show tests: add test for "git show <tree>" Add missing tests for showing a tree with "git show". Let's test for showing a tree, two trees, and that doing so doesn't recurse. The only tests for this code added in `5d7eeee2ac` (git-show: grok blobs, trees and tags, too, 2006-12-14) were the tests in t7701-repack-unpack-unreachable.sh added in `ccc1297226` (repack: modify behavior of -A option to leave unreferenced objects unpacked, 2008-05-09). Let's add this common mode of operation to the "show" tests themselves. It's more obvious, and the tests in t7701-repack-unpack-unreachable.sh happily pass if we start buggily emitting trees recursively. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-20 16:09:25 -07:00
Elijah Newren	f3b964a07e	Add testing with merge-ort merge strategy In preparation for switching from merge-recursive to merge-ort as the default strategy, have the testsuite default to running with merge-ort. Keep coverage of the recursive backend by having the linux-gcc job run with it. Signed-off-by: Elijah Newren <newren@gmail.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-20 12:35:40 -07:00
Elijah Newren	259490e572	t6423: mark remaining expected failure under merge-ort as such When we started on merge-ort, thousands of tests failed when run with the GIT_TEST_MERGE_ALGORITHM=ort flag; with so many, it didn't make sense to flip all their test expectations. The ones in t6409, t6418, and the submodule tests are being handled by an independent in-flight topic ("Complete merge-ort implemenation...almost"). The ones in t6423 were left out of the other series because other ongoing series that this commit depends upon were addressing those. Now that we only have one remaining test failure in t6423, let's mark it as such. This remaining test will be fixed by a future optimization series, but since merge-recursive doesn't pass this test either, passing it is not necessary for declaring merge-ort ready for general use. Signed-off-by: Elijah Newren <newren@gmail.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-20 12:35:40 -07:00
Elijah Newren	41376b58e6	Revert "merge-ort: ignore the directory rename split conflict for now" This reverts commit `5ced7c3da0`, which was put in place as a temporary measure to avoid optimizations unstably erroring out on no destination having a majority of the necessary renames for directories that had no new files and thus no need for directory rename detection anyway. Now that optimizations are in place to prevent us from trying to compute directory rename count computations for directories that do not need it, we can undo this temporary measure. Signed-off-by: Elijah Newren <newren@gmail.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-20 12:35:40 -07:00
Elijah Newren	816147e7ba	merge-recursive: add a bunch of FIXME comments documenting known bugs The plan is to just delete merge-recursive, but not until everyone is comfortable with merge-ort as a replacement. Given that I haven't switched all callers of merge-recursive over yet (e.g. git-am still uses merge-recursive), maybe there's some value documenting known bugs in the algorithm in case we end up keeping it or someone wants to dig it up in the future. Signed-off-by: Elijah Newren <newren@gmail.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-20 12:35:40 -07:00
Elijah Newren	5291828df8	merge-ort: write $GIT_DIR/AUTO_MERGE whenever we hit a conflict There are a variety of questions users might ask while resolving conflicts: * What changes have been made since the previous (first) parent? * What changes are staged? * What is still unstaged? (or what is still conflicted?) * What changes did I make to resolve conflicts so far? The first three of these have simple answers: * git diff HEAD * git diff --cached * git diff There was no way to answer the final question previously. Adding one is trivial in merge-ort, since it works by creating a tree representing what should be written to the working copy complete with conflict markers. Simply write that tree to .git/AUTO_MERGE, allowing users to answer the fourth question with * git diff AUTO_MERGE I avoided using a name like "MERGE_AUTO", because that would be merge-specific (much like MERGE_HEAD, REBASE_HEAD, REVERT_HEAD, CHERRY_PICK_HEAD) and I wanted a name that didn't change depending on which type of operation the merge was part of. Ensure that paths which clean out other temporary operation-specific files (e.g. CHERRY_PICK_HEAD, MERGE_MSG, rebase-merge/ state directory) also clean out this AUTO_MERGE file. Signed-off-by: Elijah Newren <newren@gmail.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-20 12:35:40 -07:00
Elijah Newren	aa2faac03a	t: mark several submodule merging tests as fixed under merge-ort merge-ort handles submodules (and directory/file conflicts in general) differently than merge-recursive does; it basically puts all the special handling for different filetypes into one place in the codebase instead of needing special handling for different filetypes in many different code paths. This one code path in merge-ort could perhaps use some work still (there are still test_expect_failure cases in the testsuite), but it passes all the tests that merge-recursive does as well as 12 additional ones that merge-recursive fails. Mark those 12 tests as test_expect_success under merge-ort. Signed-off-by: Elijah Newren <newren@gmail.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-20 12:35:40 -07:00
Elijah Newren	66b209b86a	merge-ort: implement CE_SKIP_WORKTREE handling with conflicted entries When merge conflicts occur in paths removed by a sparse-checkout, we need to unsparsify those paths (clear the SKIP_WORKTREE bit), and write out the conflicted file to the working copy. In the very unlikely case that someone manually put a file into the working copy at the location of the SKIP_WORKTREE file, we need to avoid overwriting whatever edits they have made and move that file to a different location first. Signed-off-by: Elijah Newren <newren@gmail.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-20 12:35:40 -07:00
Elijah Newren	8ddc20b896	t6428: new test for SKIP_WORKTREE handling and conflicts If there is a conflict during a merge for a SKIP_WORKTREE entry, we expect that file to be written to the working copy and have the SKIP_WORKTREE bit cleared in the index. If the user had manually created a file in the working tree despite SKIP_WORKTREE being set, we do not want to clobber their changes to that file, but want to move it out of the way. Add tests that check for these behaviors. Signed-off-by: Elijah Newren <newren@gmail.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-20 12:35:40 -07:00
Elijah Newren	3639dfb3a8	merge-ort: support subtree shifting merge-recursive has some simple code to support subtree shifting; copy it over to merge-ort. This fixes t6409.12 under GIT_TEST_MERGE_ALGORITHM=ort. Signed-off-by: Elijah Newren <newren@gmail.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-20 12:35:40 -07:00
Elijah Newren	3860220bfa	merge-ort: let renormalization change modify/delete into clean delete When we have a modify/delete conflict, but the only change to the modification is e.g. change of line endings, then if renormalization is requested then we should be able to recognize such a case as a not-modified/delete and resolve the conflict automatically. This fixes t6418.10 under GIT_TEST_MERGE_ALGORITHM=ort. Signed-off-by: Elijah Newren <newren@gmail.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-20 12:35:40 -07:00
Elijah Newren	1218b3ab86	merge-ort: have ll_merge() use a special attr_index for renormalization ll_merge() needs an index when renormalization is requested. Create one specifically for just this purpose with just the one needed entry. This fixes t6418.4 and t6418.5 under GIT_TEST_MERGE_ALGORITHM=ort. NOTE 1: Even if the user has a working copy or a real index (which is not a given as merge-ort can be used in bare repositories), we explicitly ignore any .gitattributes file from either of these locations. merge-ort can be used to merge two branches that are unrelated to HEAD, so .gitattributes from the working copy and current index should not be considered relevant. NOTE 2: Since we are in the middle of merging, there is a risk that .gitattributes itself is conflicted...leaving us with an ill-defined situation about how to perform the rest of the merge. It could be that the .gitattributes file does not even exist on one of the sides of the merge, or that it has been modified on both sides. If it's been modified on both sides, it's possible that it could itself be merged cleanly, though it's also possible that it only merges cleanly if you use the right version of the .gitattributes file to drive the merge. It gets kind of complicated. The only test we ever had that attempted to test behavior in this area was seemingly unaware of the undefined behavior, but knew the test wouldn't work for lack of attribute handling support, marked it as test_expect_failure from the beginning, but managed to fail for several reasons unrelated to attribute handling. See commit `6f6e7cfb52` ("t6038: remove problematic test", 2020-08-03) for details. So there are probably various ways to improve what initialize_attr_index() picks in the case of a conflicted .gitattributes but for now I just implemented something simple -- look for whatever .gitattributes file we can find in any of the higher order stages and use it. Signed-off-by: Elijah Newren <newren@gmail.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-20 12:35:40 -07:00
Elijah Newren	ea305a68fd	merge-ort: add a special minimal index just for renormalization renormalize_buffer() requires an index_state, which is something that merge-ort does not operate with. However, all the renormalization code needs is an index with a .gitattributes file...plus a little bit of setup. Create such an index, along with the deallocation and attr_direction handling. A subsequent commit will add a function to finish the initialization of this index. Signed-off-by: Elijah Newren <newren@gmail.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-20 12:35:40 -07:00
Elijah Newren	72b3091040	merge-ort: use STABLE_QSORT instead of QSORT where required rename/rename conflict handling depends on the fact that if both sides renamed the same path, that the one on the MERGE_SIDE1 will appear first in the combined diff_queue_struct passed to process_renames(). Since we add all pairs from MERGE_SIDE1 to combined first, and then all pairs from MERGE_SIDE2, and then sort based on filename, this will only be true if the sort used is stable. This was found due to the fact that Mac, unlike Linux, apparently has a system-defined qsort that is not stable. While we are at it, review the other callers of QSORT and add comments about why they can remain as calls to QSORT instead of being modified to call STABLE_QSORT. Signed-off-by: Elijah Newren <newren@gmail.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-20 12:35:39 -07:00
Junio C Hamano	ef486a9ecf	Merge branch 'tb/git-mv-icase-fix' Fix a corner case bug in "git mv" on case insensitive systems, which was introduced in 2.29 timeframe. * tb/git-mv-icase-fix: git mv foo FOO ; git mv foo bar gave an assert	2021-03-19 15:25:40 -07:00
Junio C Hamano	98164e9585	The first batch in 2.32 cycle Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-19 15:25:40 -07:00
Junio C Hamano	bfcc6e2a68	Merge branch 'rs/xcalloc-takes-nelem-first' Code cleanup. * rs/xcalloc-takes-nelem-first: fix xcalloc() argument order	2021-03-19 15:25:39 -07:00
Junio C Hamano	af107029b1	Merge branch 'ah/make-fuzz-all-doc-update' Update insn in Makefile comments to run fuzz-all target. * ah/make-fuzz-all-doc-update: Makefile: update 'make fuzz-all' docs to reflect modern clang	2021-03-19 15:25:39 -07:00
Junio C Hamano	c691e918f4	Merge branch 'jk/slimmed-down' Unused code removal. * jk/slimmed-down: vcs-svn: remove header files as well	2021-03-19 15:25:38 -07:00
Junio C Hamano	92ccd7b752	Merge branch 'rs/calloc-array' CALLOC_ARRAY() macro replaces many uses of xcalloc(). * rs/calloc-array: cocci: allow xcalloc(1, size) use CALLOC_ARRAY git-compat-util.h: drop trailing semicolon from macro definition	2021-03-19 15:25:38 -07:00
Junio C Hamano	a8a0ac3234	Merge branch 'rs/avoid-null-statement-after-macro-call' Fix macros that can silently inject unintended null-statements. * rs/avoid-null-statement-after-macro-call: mem-pool: drop trailing semicolon from macro definition block-sha1: drop trailing semicolon from macro definition	2021-03-19 15:25:38 -07:00
Junio C Hamano	948e8ac534	Merge branch 'km/config-doc-typofix' Docfix. * km/config-doc-typofix: config.txt: add missing period	2021-03-19 15:25:38 -07:00
Junio C Hamano	cc930b7472	Merge branch 'jt/clone-unborn-head' Test fix. * jt/clone-unborn-head: t5606: run clone branch name test with protocol v2	2021-03-19 15:25:38 -07:00
Junio C Hamano	1dd4e74522	Merge branch 'js/fsmonitor-unpack-fix' The data structure used by fsmonitor interface was not properly duplicated during an in-core merge, leading to use-after-free etc. * js/fsmonitor-unpack-fix: fsmonitor: do not forget to release the token in `discard_index()` fsmonitor: fix memory corruption in some corner cases	2021-03-19 15:25:37 -07:00
Junio C Hamano	35381b13da	Merge branch 'jk/bisect-peel-tag-fix' "git bisect" reimplemented more in C during 2.30 timeframe did not take an annotated tag as a good/bad endpoint well. This regression has been corrected. * jk/bisect-peel-tag-fix: bisect: peel annotated tags to commits	2021-03-19 15:25:37 -07:00
Junio C Hamano	8779c141da	Merge branch 'jh/fsmonitor-prework' The fsmonitor interface read from its input without making sure there is something to read from. This bug is new in 2.31 timeframe. * jh/fsmonitor-prework: fsmonitor: avoid global-buffer-overflow READ when checking trivial response	2021-03-19 15:25:37 -07:00
Junio C Hamano	eabacfd9cb	Merge branch 'jc/calloc-fix' Code clean-up. * jc/calloc-fix: xcalloc: use CALLOC_ARRAY() when applicable	2021-03-19 15:25:37 -07:00
Taylor Blau	14e7b8344f	builtin/pack-objects.c: ignore missing links with --stdin-packs When 'git pack-objects --stdin-packs' encounters a commit in a pack, it marks it as a starting point of a best-effort reachability traversal that is used to populate the name-hash of the objects listed in the given packs. The traversal expects that it should be able to walk the ancestors of all commits in a pack without issue. Ordinarily this is the case, but it is possible to having missing parents from an unreachable part of the repository. In that case, we'd consider any missing objects in the unreachable portion of the graph to be junk. This should be handled gracefully: since the traversal is best-effort (i.e., we don't strictly need to fill in all of the name-hash fields), we should simply ignore any missing links. This patch does that (by setting the 'ignore_missing_links' bit on the rev_info struct), and ensures we don't regress in the future by adding a test which demonstrates this case. It is a little over-eager, since it will also ignore missing links in reachable parts of the packs (which would indicate a corrupted repository), but '--stdin-packs' is explicitly not about reachability. So this step isn't making anything worse for a repository which contains packs missing reachable objects (since we never drop objects with '--stdin-packs'). Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-19 11:19:29 -07:00
Bagas Sanjaya	6534d436a2	INSTALL: note on using Asciidoctor to build doc Note on using Asciidoctor to build documentation suite. Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-19 10:49:20 -07:00
Elijah Newren	9bd342137e	diffcore-rename: determine which relevant_sources are no longer relevant As noted a few commits ago ("diffcore-rename: only compute dir_rename_count for relevant directories"), when a source file rename is used as part of directory rename detection, we need to increment counts for each ancestor directory in dirs_removed with value RELEVANT_FOR_SELF. However, a few commits ago ("diffcore-rename: check if we have enough renames for directories early on"), we may have downgraded all relevant ancestor directories from RELEVANT_FOR_SELF to RELEVANT_FOR_ANCESTOR. For a given file, if no ancestor directory is found in dirs_removed with a value of RELEVANT_FOR_SELF, then we can downgrade relevant_source[PATH] from RELEVANT_LOCATION to RELEVANT_NO_MORE. This means we can skip detecting a rename for that particular path (and any other paths in the same directory). For the testcases mentioned in commit `557ac0350d` ("merge-ort: begin performance work; instrument with trace2_region_* calls", 2020-10-28), this change improves the performance as follows: Before After no-renames: 5.680 s ± 0.096 s 5.665 s ± 0.129 s mega-renames: 13.812 s ± 0.162 s 11.435 s ± 0.158 s just-one-mega: 506.0 ms ± 3.9 ms 494.2 ms ± 6.1 ms While this improvement looks rather modest for these testcases (because all the previous optimizations were sufficient to nearly remove all time spent in rename detection already), consider this alternative testcase tweaked from the ones in commit `557ac0350d` as follows <Same initial setup as commit `557ac0350d`, then...> $ git switch -c add-empty-file v5.5 $ >drivers/gpu/drm/i915/new-empty-file $ git add drivers/gpu/drm/i915/new-empty-file $ git commit -m "new file" $ git switch 5.4-rename $ git cherry-pick --strategy=ort add-empty-file For this testcase, we see the following improvement: Before After pick-empty: 1.936 s ± 0.024 s 688.1 ms ± 4.2 ms So roughly a factor of 3 speedup. At $DAYJOB, there was a particular repository and cherry-pick that inspired this optimization; for that case I saw a speedup factor of 7 with this optimization. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-18 14:32:56 -07:00
Elijah Newren	ec59da6015	merge-ort: record the reason that we want a rename for a file There are two different reasons we might want a rename for a file -- for three-way content merging or as part of directory rename detection. Record the reason. diffcore-rename will potentially be able to filter some of the ones marked as needed only for directory rename detection, if it can determine those directory renames based solely on renames found via exact rename detection and basename-guided rename detection. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-18 14:32:56 -07:00
Elijah Newren	bf238b7137	diffcore-rename: add computation of number of unknown renames The previous commit can only be effective if we have a computation of the number of paths under a given directory which are still have pending renames, and expected this number to be recorded in the dir_rename_count map under the key UNKNOWN_DIR. Add the code necessary to compute these values. Note that this change means dir_rename_count might have a directory whose only entry (for UNKNOWN_DIR) was removed by the time merge-ort goes to check it. To account for this, merge-ort needs to check for the case where the max count is 0. With this change we are now computing the necessary value for each directory in dirs_removed, but are not using that value anywhere. The next two commits will make use of the values stored in dirs_removed in order to compute whether each relevant_source (that is needed only for directory rename detection) has become unnecessary. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-18 14:32:56 -07:00
Elijah Newren	0491d39297	diffcore-rename: check if we have enough renames for directories early on As noted in the past few commits, if we can determine that a directory already has enough renames to determine how directory rename detection will be decided for that directory, then we can mark that directory as no longer needing any more renames detected for files underneath it. For such directories, we change the value in the dirs_removed map from RELEVANT_TO_SELF to RELEVANT_FOR_ANCESTOR. A subsequent patch will use this information while iterating over the remaining potential rename sources to mark ones that were only location_relevant as unneeded if no containing directory is still marked as RELEVANT_TO_SELF. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-18 14:32:56 -07:00
Elijah Newren	e54385b97a	diffcore-rename: only compute dir_rename_count for relevant directories When one side adds files to a directory that the other side renamed, directory rename detection is used to either move the new paths to the newer directory or warn the user about the fact that another path location might be better. If a parent of the given directory had new files added to it, any renames in the current directory are also part of determining where the parent directory is renamed to. Thus, naively, we need to record each rename N times for a path at depth N. However, we can use the additional information added to dirs_removed in the last commit to avoid traversing all N parent directories in many cases. Let's use an example to explain how this works. If we have a path named src/old_dir/a/b/file.c and src/old_dir doesn't exist on one side of history, but the other added a file named src/old_dir/newfile.c, then if one side renamed src/old_dir/a/b/file.c => source/new_dir/a/b/file.c then this file would affect potential directory rename detection counts for src/old_dir/a/b => source/new_dir/a/b src/old_dir/a => source/new_dir/a src/old_dir => source/new_dir src => source adding a weight of 1 to each in dir_rename_counts. However, if src/ exists on both sides of history, then we don't need to track any entries for it in dir_rename_counts. That was implemented previously. What we are adding now, is that if no new files were added to src/old_dir/a or src/old_dir/b, then we don't need to have counts in dir_rename_count for those directories either. In short, we only need to track counts in dir_rename_count for directories whose dirs_removed value is RELEVANT_FOR_SELF. And as soon as we reach a directory that isn't in dirs_removed (signalled by returning the default value of NOT_RELEVANT from strintmap_get()), we can stop looking any further up the directory hierarchy. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-18 14:32:55 -07:00
Elijah Newren	fb52938eec	merge-ort: record the reason that we want a rename for a directory When one side of history renames a directory, and the other side of history added files to the old directory, directory rename detection is used to warn about the location of the added files so the user can move them to the old directory or keep them with the new one. This sets up three different types of directories: * directories that had new files added to them * directories underneath a directory that had new files added to them * directories where no new files were added to it or any leading path Save this information in dirs_removed; the next several commits will make use of this information. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-18 14:32:55 -07:00
Elijah Newren	a49b55d52e	merge-ort, diffcore-rename: tweak dirs_removed and relevant_source type As noted in the previous commit, we want to be able to take advantage of the "majority rules" portion of directory rename detection to avoid detecting more renames than necessary. However, for diffcore-rename to take advantage of that, it needs to know whether a rename source file was needed for just directory rename detection reasons, or if it is wanted for potential three-way content merging. Modify relevant_sources from a strset to a strintmap, so we can encode additional information. We also modify dirs_removed from a strset to a strintmap at the same time because trying to determine what files are needed for directory rename detection will require us tracking a bit more information for each directory. This commit only changes the types of the two variables from strset to strintmap; it does not actually store any special values yet and for now only checks for presence of entries in the strintmap. Thus, the code is functionally identical to how it behaved before. Future commits will start associating values with each key for these two maps. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-18 14:32:55 -07:00
Elijah Newren	ae1db7b31c	diffcore-rename: take advantage of "majority rules" to skip more renames In directory rename detection (when a directory is removed on one side of history and the other side adds new files to that directory), we work to find where the greatest number of files within that directory were renamed to so that the new files can be moved with the majority of the files. Naively, we can just do this by detecting renames for all files within the removed/renamed directory, looking at all the destination directories where files within that directory were moved, and if there is more than one such directory then taking the one with the greatest number of files as the directory where the old directory was renamed to. However, sometimes there are enough renames from exact rename detection or basename-guided rename detection that we have enough information to determine the majority winner already. Add a function meant to compute whether particular renames are still needed based on this majority rules check. The next several commits will then add the necessary infrastructure to get the information we need to compute which additional rename sources we can skip. An important side note for future further optimization: There is a possible improvement to this optimization that I have not yet attempted and will not be included in this series of patches: we could first check whether exact renames provide enough information for us to determine directory renames, and avoid doing basename-guided rename detection on some or all of the RELEVANT_LOCATION files within those directories. In effect, this variant would mean doing the handle_early_known_dir_renames() both after exact rename detection and again after basename-guided rename detection, though it would also mean decrementing the number of "unknown" renames for each rename we found from basename-guided rename detection. Adding this additional check for skippable renames right after exact rename detection might turn out to be valuable, especially for partial clones where it might allow us to download certain source files entirely. However, this particular optimization was actually the last one I did in original implementation order, and by the time I implemented this idea, every testcase I had was sufficiently fast that further optimization was unwarranted. If future testcases arise that tax rename detection more heavily (or perhaps partial clones can benefit from avoiding loading more objects), it may be worth implementing this more involved variant. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-18 14:32:55 -07:00
Jeff King	27d578d904	t: annotate !PTHREADS tests with !FAIL_PREREQS Some tests in t5300 and t7810 expect us to complain about a "--threads" argument when Git is compiled without pthread support. Running these under GIT_TEST_FAIL_PREREQS produces a confusing failure: we pretend to the tests that there is no pthread support, so they expect the warning, but of course the actual build is perfectly happy to respect the --threads argument. We never noticed before the recent `a926c4b904` (tests: remove most uses of C_LOCALE_OUTPUT, 2021-02-11), because the tests also were marked as requiring the C_LOCALE_OUTPUT prerequisite. Which means they'd never have run in FAIL_PREREQS mode, since it would always pretend that the locale prereq was not satisfied. These tests can't possibly work in this mode; it is a mismatch between what the tests expect and what the build was told to do. So let's just mark them to be skipped, using the special prereq introduced by `dfe1a17df9` (tests: add a special setup where prerequisites fail, 2019-05-13). Reported-by: Son Luong Ngoc <sluongng@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-18 14:17:30 -07:00
Jeff Hostetler	f59d15bb42	convert: add classification for conv_attrs struct Create `enum conv_attrs_classification` to express the different ways that attributes are handled for a blob during checkout. This will be used in a later commit when deciding whether to add a file to the parallel or delayed queue during checkout. For now, we can also use it in get_stream_filter_ca() to simplify the function (as the classifying logic is the same). Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-18 13:56:40 -07:00
Jeff Hostetler	3e9e82c0d8	convert: add get_stream_filter_ca() variant Like the previous patch, we will also need to call get_stream_filter() with a precomputed `struct conv_attrs`, when we add support for parallel checkout workers. So add the _ca() variant which takes the conversion attributes struct as a parameter. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-18 13:56:40 -07:00
Jeff Hostetler	55b4ad0ead	convert: add [async_]convert_to_working_tree_ca() variants Separate the attribute gathering from the actual conversion by adding _ca() variants of the conversion functions. These variants receive a precomputed 'struct conv_attrs', not relying, thus, on an index state. They will be used in a future patch adding parallel checkout support, for two reasons: - We will already load the conversion attributes in checkout_entry(), before conversion, to decide whether a path is eligible for parallel checkout. Therefore, it would be wasteful to load them again later, for the actual conversion. - The parallel workers will be responsible for reading, converting and writing blobs to the working tree. They won't have access to the main process' index state, so they cannot load the attributes. Instead, they will receive the preloaded ones and call the _ca() variant of the conversion functions. Furthermore, the attributes machinery is optimized to handle paths in sequential order, so it's better to leave it for the main process, anyway. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-18 13:56:40 -07:00
Jeff Hostetler	38e95844e8	convert: make convert_attrs() and convert structs public Move convert_attrs() declaration from convert.c to convert.h, together with the conv_attrs struct and the crlf_action enum. This function and the data structures will be used outside convert.c in the upcoming parallel checkout implementation. Note that crlf_action is renamed to convert_crlf_action, which is more appropriate for the global namespace. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-18 13:56:40 -07:00
Nipunn Koorapati	7e5aa13d2c	fsmonitor: add perf test for git diff HEAD Update the xargs call so that if your large repo contains symlinks, test-tool chmtime failure does not end the script. On Linux Test this tree upstream/master --------------------------------------------------------------------------------------------------------- 7519.4: status (fsmonitor=fsmonitor-watchman) 0.52(0.43+0.10) 0.53(0.49+0.05) +1.9% 7519.5: status -uno (fsmonitor=fsmonitor-watchman) 0.21(0.15+0.07) 0.22(0.13+0.09) +4.8% 7519.6: status -uall (fsmonitor=fsmonitor-watchman) 1.65(0.93+0.71) 1.69(1.03+0.65) +2.4% 7519.7: status (dirty) (fsmonitor=fsmonitor-watchman) 11.99(11.34+1.58) 11.95(11.02+1.79) -0.3% 7519.8: diff (fsmonitor=fsmonitor-watchman) 0.25(0.17+0.26) 0.25(0.18+0.26) +0.0% 7519.9: diff HEAD (fsmonitor=fsmonitor-watchman) 0.39(0.25+0.34) 0.89(0.35+0.74) +128.2% 7519.10: diff -- 0_files (fsmonitor=fsmonitor-watchman) 0.16(0.13+0.04) 0.16(0.12+0.05) +0.0% 7519.11: diff -- 10_files (fsmonitor=fsmonitor-watchman) 0.16(0.12+0.05) 0.16(0.12+0.05) +0.0% 7519.12: diff -- 100_files (fsmonitor=fsmonitor-watchman) 0.16(0.12+0.05) 0.16(0.12+0.05) +0.0% 7519.13: diff -- 1000_files (fsmonitor=fsmonitor-watchman) 0.16(0.11+0.06) 0.16(0.12+0.05) +0.0% 7519.14: diff -- 10000_files (fsmonitor=fsmonitor-watchman) 0.18(0.13+0.06) 0.17(0.10+0.08) -5.6% 7519.15: add (fsmonitor=fsmonitor-watchman) 2.25(1.53+0.68) 2.25(1.47+0.74) +0.0% 7519.18: status (fsmonitor=disabled) 0.88(0.73+1.03) 0.89(0.67+1.08) +1.1% 7519.19: status -uno (fsmonitor=disabled) 0.45(0.43+0.89) 0.45(0.34+0.98) +0.0% 7519.20: status -uall (fsmonitor=disabled) 1.88(1.16+1.58) 1.88(1.22+1.51) +0.0% 7519.21: status (dirty) (fsmonitor=disabled) 7.53(7.05+2.11) 7.53(6.98+2.04) +0.0% 7519.22: diff (fsmonitor=disabled) 0.42(0.37+0.92) 0.42(0.38+0.91) +0.0% 7519.23: diff HEAD (fsmonitor=disabled) 0.44(0.41+0.90) 0.44(0.40+0.91) +0.0% 7519.24: diff -- 0_files (fsmonitor=disabled) 0.13(0.09+0.05) 0.13(0.09+0.05) +0.0% 7519.25: diff -- 10_files (fsmonitor=disabled) 0.13(0.10+0.04) 0.13(0.10+0.04) +0.0% 7519.26: diff -- 100_files (fsmonitor=disabled) 0.13(0.09+0.05) 0.13(0.10+0.04) +0.0% 7519.27: diff -- 1000_files (fsmonitor=disabled) 0.13(0.09+0.06) 0.13(0.09+0.05) +0.0% 7519.28: diff -- 10000_files (fsmonitor=disabled) 0.14(0.11+0.05) 0.14(0.10+0.05) +0.0% 7519.29: add (fsmonitor=disabled) 2.43(1.61+1.64) 2.43(1.69+1.57) +0.0% On linux (2.29.2 vs w/ this patch): nipunn@nipunn-dbx:~/src/server3$ strace -f -c git diff 2>&1 \| grep lstat 0.04 0.000063 3 20 6 lstat nipunn@nipunn-dbx:~/src/server3$ strace -f -c git diff HEAD 2>&1 \| grep lstat 94.98 5.242262 10 523783 13 lstat nipunn@nipunn-dbx:~/src/server3$ strace -f -c ../git/bin-wrappers/git diff 2>&1 \| grep lstat 0.38 0.000032 5 7 3 lstat nipunn@nipunn-dbx:~/src/server3$ strace -f -c ../git/bin-wrappers/git diff HEAD 2>&1 \| grep lstat 99.44 0.741892 9 81634 10 lstat On mac (2.29.2 vs w/ this patch): nipunn-mbp:server nipunn$ sudo dtruss -L -f -c git diff 2>&1 \| grep "^lstat64 " lstat64 8 nipunn-mbp:server nipunn$ sudo dtruss -L -f -c git diff HEAD 2>&1 \| grep "^lstat64 " lstat64 120242 nipunn-mbp:server nipunn$ sudo dtruss -L -f -c ../git/bin-wrappers/git diff 2>&1 \| grep "^lstat64 " lstat64 4 nipunn-mbp:server nipunn$ sudo dtruss -L -f -c ../git/bin-wrappers/git diff HEAD 2>&1 \| grep "^lstat64 " lstat64 4497 There are still a bunch of lstats - on directories, but not every file. Progress! Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-18 13:31:14 -07:00
Nipunn Koorapati	0ec9949f78	fsmonitor: add assertion that fsmonitor is valid to check_removed Validate that fsmonitor is valid to futureproof against bugs where check_removed might be called from places that haven't refreshed. Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-18 13:31:13 -07:00
Nipunn Koorapati	4f3d6d0261	fsmonitor: skip lstat deletion check during git diff-index Teach git to honor fsmonitor rather than issuing an lstat when checking for dirty local deletes. Eliminates O(files) lstats during `git diff HEAD` Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-18 13:31:11 -07:00
Matheus Tavares	fab78a0c3d	checkout: don't follow symlinks when removing entries At `1d718a5108` ("do not overwrite untracked symlinks", 2011-02-20), symlink.c:check_leading_path() started returning different codes for FL_ENOENT and FL_SYMLINK. But one of its callers, unlink_entry(), was not adjusted for this change, so it started to follow symlinks on the leading path of to-be-removed entries. Fix that and add a regression test. Note that since `1d718a5108` check_leading_path() no longer differentiates the case where it found a symlink in the path's leading components from the cases where it found a regular file or failed to lstat() the component. So, a side effect of this current patch is that unlink_entry() now returns early in all of these three cases. And because we no longer try to unlink such paths, we also don't get the warning from remove_or_warn(). For the regular file and symlink cases, it's questionable whether the warning was useful in the first place: unlink_entry() removes tracked paths that should no longer be present in the state we are checking out to. If the path had its leading dir replaced by another file, it means that the basename already doesn't exist, so there is no need for a warning. Sure, we are leaving a regular file or symlink behind at the path's dirname, but this file is either untracked now (so again, no need to warn), or it will be replaced by a tracked file during the next phase of this checkout operation. As for failing to lstat() one of the leading components, the basename might still exist only we cannot unlink it (e.g. due to the lack of the required permissions). Since the user expect it to be removed (especially with checkout's --no-overlay option), add back the warning in this more relevant case. Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-18 12:58:10 -07:00
Matheus Tavares	462b4e8dfd	symlinks: update comment on threaded_check_leading_path() Since `1d718a5108` ("do not overwrite untracked symlinks", 2011-02-20), the comment on top of threaded_check_leading_path() is outdated and no longer reflects the behavior of this function. Let's updated it to avoid confusions. While we are here, also remove some duplicated comments to avoid similar maintenance problems. Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-18 12:58:08 -07:00
Ævar Arnfjörð Bjarmason	fb79f5bff7	fsck.c: refactor and rename common config callback Refactor code I recently changed in `1f3299fda9` (fsck: make fsck_config() re-usable, 2021-01-05) so that I could use fsck's config callback in mktag in `1f3299fda9` (fsck: make fsck_config() re-usable, 2021-01-05). I don't know what I was thinking in structuring the code this way, but it clearly makes no sense to have an fsck_config_internal() at all just so it can get a fsck_options when git_config() already supports passing along some void* data. Let's just make use of that instead, which gets us rid of the two wrapper functions, and brings fsck's common config callback in line with other such reusable config callbacks. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-17 14:02:43 -07:00
Johannes Schindelin	4abc57848d	fsmonitor: do not forget to release the token in `discard_index()` In `56c6910028` (fsmonitor: change last update timestamp on the index_state to opaque token, 2020-01-07), we forgot to adjust `discard_index()` to release the "last-update" token: it is no longer a 64-bit number, but a free-form string that has been allocated. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-17 12:19:28 -07:00
Johannes Schindelin	3dfd30598b	fsmonitor: fix memory corruption in some corner cases In `56c6910028` (fsmonitor: change last update timestamp on the index_state to opaque token, 2020-01-07), we forgot to adjust the part of `unpack_trees()` that copies the FSMonitor "last-update" information that we copy from the source index to the result index since `679f2f9fdd` (unpack-trees: skip stat on fsmonitor-valid files, 2019-11-20). Since the "last-update" information is no longer a 64-bit number, but a free-form string that has been allocated, we need to duplicate it rather than just copying it. This is important because there _are_ cases when `unpack_trees()` will perform a oneway merge that implicitly calls `refresh_fsmonitor()` (which will allocate that "last-update" token). This happens _after_ that token was copied into the result index. However, we _then_ call `check_updates()` on that index, which will _also_ call `refresh_fsmonitor()`, accessing the "last-update" string, which by now would be released already. In the instance that lead to this patch, this caused a segmentation fault during a lengthy, complicated rebase involving the todo command `reset` that (crucially) had to updated many files. Unfortunately, it seems very hard to trigger that crash, therefore this patch is not accompanied by a regression test. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-17 12:19:26 -07:00
Kyle Meyer	cfd409ed09	config.txt: add missing period Signed-off-by: Kyle Meyer <kyle@kyleam.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-17 11:25:15 -07:00
Jeff King	7730f85594	bisect: peel annotated tags to commits This patch fixes a bug where git-bisect doesn't handle receiving annotated tags as "git bisect good <tag>", etc. It's a regression in `27257bc466` (bisect--helper: reimplement `bisect_state` & `bisect_head` shell functions in C, 2020-10-15). The original shell code called: sha=$(git rev-parse --verify "$rev^{commit}") \|\| die "$(eval_gettext "Bad rev input: \$rev")" which will peel the input to a commit (or complain if that's not possible). But the C code just calls get_oid(), which will yield the oid of the tag. The fix is to peel to a commit. The error message here is a little non-idiomatic for Git (since it starts with a capital). I've mostly left it, as it matches the other converted messages (like the "Bad rev input" we print when get_oid() fails), though I did add an indication that it was the peeling that was the problem. It might be worth taking a pass through this converted code to modernize some of the error messages. Note also that the test does a bare "grep" (not i18ngrep) on the expected "X is the first bad commit" output message. This matches the rest of the test script. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-17 11:24:08 -07:00
Jonathan Tan	5f70859c15	t5606: run clone branch name test with protocol v2 `4f37d45706` ("clone: respect remote unborn HEAD", 2021-02-05) introduces a new feature (if the remote has an unborn HEAD, e.g. when the remote repository is empty, use it as the name of the branch) that only works in protocol v2, but did not ensure that one of its tests always uses protocol v2, and thus that test would fail if GIT_TEST_PROTOCOL_VERSION=0 (or 1) is used. Therefore, add "-c protocol.version=2" to the appropriate test. (The rest of the tests from that commit have "-c protocol.version=2" already added.) Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-17 11:19:36 -07:00
René Scharfe	116affac3f	mem-pool: drop trailing semicolon from macro definition Allow BLOCK_GROWTH_SIZE to be used like an integer literal by removing the trailing semicolon from its definition. Also wrap the expression in parentheses, to allow it to be used with operators without leading to unexpected results. It doesn't matter for the current use site, but make it follow standard macro rules anyway to avoid future surprises. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-17 10:20:16 -07:00
René Scharfe	3d8cbbf2c3	block-sha1: drop trailing semicolon from macro definition `23119ffb4e` (block-sha1: put expanded macro parameters in parentheses, 2012-07-22) added a trailing semicolon to the definition of SHA_MIX without explanation. It doesn't matter with the current code, but make sure to avoid potential surprises by removing it again. This allows the macro to be used almost like a function: Users can combine it with operators of their choice, but still must not pass an expression with side-effects as a parameter, as it would be evaluated multiple times. Signed-off-by: René Scharfe <l.s.r@web.de> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-17 10:20:01 -07:00
Andrzej Hunt	097ea2c848	fsmonitor: avoid global-buffer-overflow READ when checking trivial response query_result can be be an empty strbuf (STRBUF_INIT) - in that case trying to read 3 bytes triggers a buffer overflow read (as query_result.buf = '\0'). Therefore we need to check query_result's length before trying to read 3 bytes. This overflow was introduced in: `940b94f35c` (fsmonitor: log invocation of FSMonitor hook to trace2, 2021-02-03) It was found when running the test-suite against ASAN, and can be most easily reproduced with the following command: make GIT_TEST_OPTS="-v" DEFAULT_TEST_TARGET="t7519-status-fsmonitor.sh" \ SANITIZE=address DEVELOPER=1 test ==2235==ERROR: AddressSanitizer: global-buffer-overflow on address 0x0000019e6e5e at pc 0x00000043745c bp 0x7fffd382c520 sp 0x7fffd382bcc8 READ of size 3 at 0x0000019e6e5e thread T0 #0 0x43745b in MemcmpInterceptorCommon(void, int ()(void const, void const, unsigned long), void const, void const, unsigned long) /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:842:7 #1 0x43786d in bcmp /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:887:10 #2 0x80b146 in fsmonitor_is_trivial_response /home/ahunt/oss-fuzz/git/fsmonitor.c:192:10 #3 0x80b146 in query_fsmonitor /home/ahunt/oss-fuzz/git/fsmonitor.c:175:7 #4 0x80a749 in refresh_fsmonitor /home/ahunt/oss-fuzz/git/fsmonitor.c:267:21 #5 0x80bad1 in tweak_fsmonitor /home/ahunt/oss-fuzz/git/fsmonitor.c:429:4 #6 0x90f040 in read_index_from /home/ahunt/oss-fuzz/git/read-cache.c:2321:3 #7 0x8e5d08 in repo_read_index_preload /home/ahunt/oss-fuzz/git/preload-index.c:164:15 #8 0x52dd45 in prepare_index /home/ahunt/oss-fuzz/git/builtin/commit.c:363:6 #9 0x52a188 in cmd_commit /home/ahunt/oss-fuzz/git/builtin/commit.c:1588:15 #10 0x4ce77e in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11 #11 0x4ccb18 in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3 #12 0x4cb01c in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4 #13 0x4cb01c in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19 #14 0x6aca8d in main /home/ahunt/oss-fuzz/git/common-main.c:52:11 #15 0x7fb027bf5349 in __libc_start_main (/lib64/libc.so.6+0x24349) #16 0x4206b9 in _start /home/abuild/rpmbuild/BUILD/glibc-2.26/csu/../sysdeps/x86_64/start.S:120 0x0000019e6e5e is located 2 bytes to the left of global variable 'strbuf_slopbuf' defined in 'strbuf.c:51:6' (0x19e6e60) of size 1 'strbuf_slopbuf' is ascii string '' 0x0000019e6e5e is located 126 bytes to the right of global variable 'signals' defined in 'sigchain.c:11:31' (0x19e6be0) of size 512 SUMMARY: AddressSanitizer: global-buffer-overflow /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:842:7 in MemcmpInterceptorCommon(void, int ()(void const, void const, unsigned long), void const, void const, unsigned long) Shadow bytes around the buggy address: 0x000080334d70: f9 f9 f9 f9 00 f9 f9 f9 f9 f9 f9 f9 00 00 00 00 0x000080334d80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x000080334d90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x000080334da0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x000080334db0: 00 00 00 00 00 00 00 00 00 00 00 00 f9 f9 f9 f9 =>0x000080334dc0: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9[f9]01 f9 f9 f9 0x000080334dd0: f9 f9 f9 f9 03 f9 f9 f9 f9 f9 f9 f9 02 f9 f9 f9 0x000080334de0: f9 f9 f9 f9 00 f9 f9 f9 f9 f9 f9 f9 04 f9 f9 f9 0x000080334df0: f9 f9 f9 f9 01 f9 f9 f9 f9 f9 f9 f9 00 00 00 00 0x000080334e00: f9 f9 f9 f9 00 00 00 00 f9 f9 f9 f9 01 f9 f9 f9 0x000080334e10: f9 f9 f9 f9 04 f9 f9 f9 f9 f9 f9 f9 00 f9 f9 f9 Shadow byte legend (one shadow byte represents 8 application bytes): Addressable: 00 Partially addressable: 01 02 03 04 05 06 07 Heap left redzone: fa Freed heap region: fd Stack left redzone: f1 Stack mid redzone: f2 Stack right redzone: f3 Stack after return: f5 Stack use after scope: f8 Global redzone: f9 Global init order: f6 Poisoned by user: f7 Container overflow: fc Array cookie: ac Intra object redzone: bb ASan internal: fe Left alloca redzone: ca Right alloca redzone: cb Shadow gap: cc Signed-off-by: Andrzej Hunt <ajrhunt@google.com> Acked-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-17 10:00:20 -07:00
Junio C Hamano	1c57cc70ec	cocci: allow xcalloc(1, size) Allocating a pre-cleared single element is quite common and it is misleading to use CALLOC_ARRAY(); these allocations that would be affected without this change are not allocating an array. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-15 17:56:07 -07:00
Junio C Hamano	486f4bd183	xcalloc: use CALLOC_ARRAY() when applicable These are for codebase before Git 2.31 Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-15 17:51:10 -07:00
Jeff Hostetler	9fd1902762	unix-stream-server: create unix domain socket under lock Create a wrapper class for `unix_stream_listen()` that uses a ".lock" lockfile to create the unix domain socket in a race-free manner. Unix domain sockets have a fundamental problem on Unix systems because they persist in the filesystem until they are deleted. This is independent of whether a server is actually listening for connections. Well-behaved servers are expected to delete the socket when they shutdown. A new server cannot easily tell if a found socket is attached to an active server or is leftover cruft from a dead server. The traditional solution used by `unix_stream_listen()` is to force delete the socket pathname and then create a new socket. This solves the latter (cruft) problem, but in the case of the former, it orphans the existing server (by stealing the pathname associated with the socket it is listening on). We cannot directly use a .lock lockfile to create the socket because the socket is created by `bind(2)` rather than the `open(2)` mechanism used by `tempfile.c`. As an alternative, we hold a plain lockfile ("<path>.lock") as a mutual exclusion device. Under the lock, we test if an existing socket ("<path>") is has an active server. If not, we create a new socket and begin listening. Then we use "rollback" to delete the lockfile in all cases. This wrapper code conceptually exists at a higher-level than the core unix_stream_connect() and unix_stream_listen() routines that it consumes. It is isolated in a wrapper class for clarity. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-15 14:32:51 -07:00
Jeff Hostetler	77e522caae	unix-socket: disallow chdir() when creating unix domain sockets Calls to `chdir()` are dangerous in a multi-threaded context. If `unix_stream_listen()` or `unix_stream_connect()` is given a socket pathname that is too long to fit in a `sockaddr_un` structure, it will `chdir()` to the parent directory of the requested socket pathname, create the socket using a relative pathname, and then `chdir()` back. This is not thread-safe. Teach `unix_sockaddr_init()` to not allow calls to `chdir()` when this flag is set. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-15 14:32:51 -07:00
Jeff Hostetler	55144ccb0a	unix-socket: add backlog size option to unix_stream_listen() Update `unix_stream_listen()` to take an options structure to override default behaviors. This commit includes the size of the `listen()` backlog. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-15 14:32:51 -07:00
Jeff Hostetler	4f98ce5865	unix-socket: eliminate static unix_stream_socket() helper function The static helper function `unix_stream_socket()` calls `die()`. This is not appropriate for all callers. Eliminate the wrapper function and make the callers propagate the error. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-15 14:32:51 -07:00
Jeff Hostetler	59c7b88198	simple-ipc: add win32 implementation Create Windows implementation of "simple-ipc" using named pipes. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-15 14:32:50 -07:00
Jeff Hostetler	066d5234d0	simple-ipc: design documentation for new IPC mechanism Brief design documentation for new IPC mechanism allowing foreground Git client to talk with an existing daemon process at a known location using a named pipe or unix domain socket. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-15 14:32:50 -07:00
Johannes Schindelin	8c2efa5d76	pkt-line: add options argument to read_packetized_to_strbuf() Update the calling sequence of `read_packetized_to_strbuf()` to take an options argument and not assume a fixed set of options. Update the only existing caller accordingly to explicitly pass the formerly-assumed flags. The `read_packetized_to_strbuf()` function calls `packet_read()` with a fixed set of assumed options (`PACKET_READ_GENTLE_ON_EOF`). This assumption has been fine for the single existing caller `apply_multi_file_filter()` in `convert.c`. In a later commit we would like to add other callers to `read_packetized_to_strbuf()` that need a different set of options. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-15 14:32:50 -07:00
Johannes Schindelin	c4ba579397	pkt-line: add PACKET_READ_GENTLE_ON_READ_ERROR option Introduce PACKET_READ_GENTLE_ON_READ_ERROR option to help libify the packet readers. So far, the (possibly indirect) callers of `get_packet_data()` can ask that function to return an error instead of `die()`ing upon end-of-file. However, random read errors will still cause the process to die. So let's introduce an explicit option to tell the packet reader machinery to please be nice and only return an error on read errors. This change prepares pkt-line for use by long-running daemon processes. Such processes should be able to serve multiple concurrent clients and and survive random IO errors. If there is an error on one connection, a daemon should be able to drop that connection and continue serving existing and future connections. This ability will be used by a Git-aware "Builtin FSMonitor" feature in a later patch series. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-15 14:32:50 -07:00
Johannes Schindelin	3a63c6a48c	pkt-line: do not issue flush packets in write_packetized_() Remove the `packet_flush_gently()` call in `write_packetized_from_buf() and `write_packetized_from_fd()` and require the caller to call it if desired. Rename both functions to `write_packetized_from__no_flush()` to prevent later merge accidents. `write_packetized_from_buf()` currently only has one caller: `apply_multi_file_filter()` in `convert.c`. It always wants a flush packet to be written after writing the payload. However, we are about to introduce a caller that wants to write many packets before a final flush packet, so let's make the caller responsible for emitting the flush packet. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-15 14:32:50 -07:00
Jeff Hostetler	7455e05e4e	pkt-line: eliminate the need for static buffer in packet_write_gently() Teach `packet_write_gently()` to write the pkt-line header and the actual buffer in 2 separate calls to `write_in_full()` and avoid the need for a static buffer, thread-safe scratch space, or an excessively large stack buffer. Change `write_packetized_from_fd()` to allocate a temporary buffer rather than using a static buffer to avoid similar issues here. These changes are intended to make it easier to use pkt-line routines in a multi-threaded context with multiple concurrent writers writing to different streams. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-15 14:32:50 -07:00
Charvi Mendiratta	00ea64ed7a	doc/git-commit: add documentation for fixup=[amend\|reword] options Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Charvi Mendiratta <charvi077@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-15 14:29:36 -07:00
Charvi Mendiratta	8bedae4599	t3437: use --fixup with options to create amend! commit We taught `git commit --fixup` to create "amend!" commit. Let's also update the tests and use it to setup the rebase tests. Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Charvi Mendiratta <charvi077@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-15 14:29:36 -07:00
Charvi Mendiratta	3d1bda6b5b	t7500: add tests for --fixup=[amend\|reword] options Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Charvi Mendiratta <charvi077@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-15 14:29:35 -07:00
Charvi Mendiratta	3270ae82ac	commit: add a reword suboption to --fixup `git commit --fixup=reword:<commit>` aliases `--fixup=amend:<commit> --only`, where it creates an empty "amend!" commit that will reword <commit> without changing its contents when it is rebased with `--autosquash`. Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk> Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Charvi Mendiratta <charvi077@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-15 14:29:35 -07:00
Charvi Mendiratta	494d314a05	commit: add amend suboption to --fixup to create amend! commit `git commit --fixup=amend:<commit>` will create an "amend!" commit. The resulting commit message subject will be "amend! ..." where "..." is the subject line of <commit> and the initial message body will be <commit>'s message. The "amend!" commit when rebased with --autosquash will fixup the contents and replace the commit message of <commit> with the "amend!" commit's message body. In order to prevent rebase from creating commits with an empty message we refuse to create an "amend!" commit if commit message body is empty. Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk> Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Charvi Mendiratta <charvi077@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-15 14:29:35 -07:00
Charvi Mendiratta	6e0e288779	sequencer: export and rename subject_length() This function can be used in other parts of git. Let's move the function to commit.c and also rename it to make the name of the function more generic. Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Charvi Mendiratta <charvi077@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-15 14:29:35 -07:00
Junio C Hamano	a5828ae6b5	Git 2.31 Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-15 11:51:51 -07:00
Junio C Hamano	8775279891	Merge branch 'jn/mergetool-hideresolved-is-optional' Disable the recent mergetool's hideresolved feature by default for backward compatibility and safety. * jn/mergetool-hideresolved-is-optional: doc: describe mergetool configuration in git-mergetool(1) mergetool: do not enable hideResolved by default	2021-03-14 16:01:41 -07:00
Junio C Hamano	074d162eff	Merge branch 'tb/pack-revindex-on-disk' Fix for a topic in 'master'. * tb/pack-revindex-on-disk: pack-revindex.c: don't close unopened file descriptors	2021-03-14 16:01:41 -07:00
Andrzej Hunt	04fe4d75fa	init-db: silence template_dir leak when converting to absolute path template_dir starts off pointing to either argv or nothing. However if the value supplied in argv is a relative path, absolute_pathdup() is used to turn it into an absolute path. absolute_pathdup() allocates a new string, and we then "leak" it when cmd_init_db() completes. We don't bother to actually free the return value (instead we UNLEAK it), because there's no significant advantage to doing so here. Correctly freeing it would require more significant changes to code flow which would be more noisy than beneficial. Signed-off-by: Andrzej Hunt <ajrhunt@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-14 15:58:00 -07:00
Andrzej Hunt	e4de4502e6	init: remove git_init_db_config() while fixing leaks The primary goal of this change is to stop leaking init_db_template_dir. This leak can happen because: 1. git_init_db_config() allocates new memory into init_db_template_dir without first freeing the existing value. 2. init_db_template_dir might already contain data, either because: 2.1 git_config() can be invoked twice with this callback in a single process - at least 2 allocations are likely. 2.2 A single git_config() allocation can invoke the callback multiple times for a given key (see further explanation in the function docs) - each of those calls will trigger another leak. The simplest fix for the leak would be to free(init_db_template_dir) before overwriting it. Instead we choose to convert to fetching init.templatedir via git_config_get_value() as that is more explicit, more efficient, and avoids allocations (the returned result is owned by the config cache, so we aren't responsible for freeing it). If we remove init_db_template_dir, git_init_db_config() ends up being responsible only for forwarding core.* config values to platform_core_config(). However platform_core_config() already ignores non-core.* config values, so we can safely remove git_init_db_config() and invoke git_config() directly with platform_core_config() as the callback. The platform_core_config forwarding was originally added in: `287853392a` (mingw: respect core.hidedotfiles = false in git-init again, 2019-03-11 And I suspect the potential for a leak existed since the original implementation of git_init_db_config in: `90b45187ba` (Add `init.templatedir` configuration variable., 2010-02-17) LSAN output from t0001: Direct leak of 73 byte(s) in 1 object(s) allocated from: #0 0x49a859 in realloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3 #1 0x9a7276 in xrealloc /home/ahunt/oss-fuzz/git/wrapper.c:126:8 #2 0x9362ad in strbuf_grow /home/ahunt/oss-fuzz/git/strbuf.c:98:2 #3 0x936eaa in strbuf_add /home/ahunt/oss-fuzz/git/strbuf.c:295:2 #4 0x868112 in strbuf_addstr /home/ahunt/oss-fuzz/git/./strbuf.h:304:2 #5 0x86a8ad in expand_user_path /home/ahunt/oss-fuzz/git/path.c:758:2 #6 0x720bb1 in git_config_pathname /home/ahunt/oss-fuzz/git/config.c:1287:10 #7 0x5960e2 in git_init_db_config /home/ahunt/oss-fuzz/git/builtin/init-db.c:161:11 #8 0x7255b8 in configset_iter /home/ahunt/oss-fuzz/git/config.c:1982:7 #9 0x7253fc in repo_config /home/ahunt/oss-fuzz/git/config.c:2311:2 #10 0x725ca7 in git_config /home/ahunt/oss-fuzz/git/config.c:2399:2 #11 0x593e8d in create_default_files /home/ahunt/oss-fuzz/git/builtin/init-db.c:225:2 #12 0x5935c6 in init_db /home/ahunt/oss-fuzz/git/builtin/init-db.c:449:11 #13 0x59588e in cmd_init_db /home/ahunt/oss-fuzz/git/builtin/init-db.c:714:9 #14 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11 #15 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3 #16 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4 #17 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19 #18 0x69c4de in main /home/ahunt/oss-fuzz/git/common-main.c:52:11 #19 0x7f23552d6349 in __libc_start_main (/lib64/libc.so.6+0x24349) Signed-off-by: Andrzej Hunt <ajrhunt@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-14 15:57:59 -07:00
Andrzej Hunt	aa1b63971a	worktree: fix leak in dwim_branch() Make sure that we release the temporary strbuf during dwim_branch() for all codepaths (and not just for the early return). This leak appears to have been introduced in: `f60a7b763f` (worktree: teach "add" to check out existing branches, 2018-04-24) Note that UNLEAK(branchname) is still needed: the returned result is used in add(), and is stored in a pointer which is used to point at one of: - a string literal ("HEAD") - member of argv (whatever the user specified in their invocation) - or our newly allocated string returned from dwim_branch() Fixing the branchname leak isn't impossible, but does not seem worthwhile given that add() is called directly from cmd_main(), and cmd_main() returns immediately thereafter - UNLEAK is good enough. This leak was found when running t0001 with LSAN, see also LSAN output below: Direct leak of 60 byte(s) in 1 object(s) allocated from: #0 0x49a859 in realloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3 #1 0x9ab076 in xrealloc /home/ahunt/oss-fuzz/git/wrapper.c:126:8 #2 0x939fcd in strbuf_grow /home/ahunt/oss-fuzz/git/strbuf.c:98:2 #3 0x93af53 in strbuf_splice /home/ahunt/oss-fuzz/git/strbuf.c:239:3 #4 0x83559a in strbuf_check_branch_ref /home/ahunt/oss-fuzz/git/object-name.c:1593:2 #5 0x6988b9 in dwim_branch /home/ahunt/oss-fuzz/git/builtin/worktree.c:454:20 #6 0x695f8f in add /home/ahunt/oss-fuzz/git/builtin/worktree.c:525:19 #7 0x694a04 in cmd_worktree /home/ahunt/oss-fuzz/git/builtin/worktree.c:1036:10 #8 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11 #9 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3 #10 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4 #11 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19 #12 0x69caee in main /home/ahunt/oss-fuzz/git/common-main.c:52:11 #13 0x7f7b7dd10349 in __libc_start_main (/lib64/libc.so.6+0x24349) Signed-off-by: Andrzej Hunt <ajrhunt@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-14 15:57:59 -07:00
Andrzej Hunt	0c4542738e	clone: free or UNLEAK further pointers when finished Most of these pointers can safely be freed when cmd_clone() completes, therefore we make sure to free them. The one exception is that we have to UNLEAK(repo) because it can point either to argv[0], or a malloc'd string returned by absolute_pathdup(). We also have to free(path) in the middle of cmd_clone(): later during cmd_clone(), path is unconditionally overwritten with a different path, triggering a leak. Freeing the first path immediately after use (but only in the case where it contains data) seems like the cleanest solution, as opposed to freeing it unconditionally before path is reused for another path. This leak appears to have been introduced in: `f38aa83f9a` (use local cloning if insteadOf makes a local URL, 2014-07-17) These leaks were found when running t0001 with LSAN, see also an excerpt of the LSAN output below (the full list is omitted because it's far too long, and mostly consists of indirect leakage of members of the refs we are freeing). Direct leak of 178 byte(s) in 1 object(s) allocated from: #0 0x49a53d in malloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3 #1 0x9a6ff4 in do_xmalloc /home/ahunt/oss-fuzz/git/wrapper.c:41:8 #2 0x9a6fca in xmalloc /home/ahunt/oss-fuzz/git/wrapper.c:62:9 #3 0x8ce296 in copy_ref /home/ahunt/oss-fuzz/git/remote.c:885:8 #4 0x8d2ebd in guess_remote_head /home/ahunt/oss-fuzz/git/remote.c:2215:10 #5 0x51d0c5 in cmd_clone /home/ahunt/oss-fuzz/git/builtin/clone.c:1308:4 #6 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11 #7 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3 #8 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4 #9 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19 #10 0x69c45e in main /home/ahunt/oss-fuzz/git/common-main.c:52:11 #11 0x7f6a459d5349 in __libc_start_main (/lib64/libc.so.6+0x24349) Direct leak of 165 byte(s) in 1 object(s) allocated from: #0 0x49a53d in malloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3 #1 0x9a6fc4 in do_xmalloc /home/ahunt/oss-fuzz/git/wrapper.c:41:8 #2 0x9a6f9a in xmalloc /home/ahunt/oss-fuzz/git/wrapper.c:62:9 #3 0x8ce266 in copy_ref /home/ahunt/oss-fuzz/git/remote.c:885:8 #4 0x51e9bd in wanted_peer_refs /home/ahunt/oss-fuzz/git/builtin/clone.c:574:21 #5 0x51cfe1 in cmd_clone /home/ahunt/oss-fuzz/git/builtin/clone.c:1284:17 #6 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11 #7 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3 #8 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4 #9 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19 #10 0x69c42e in main /home/ahunt/oss-fuzz/git/common-main.c:52:11 #11 0x7f8fef0c2349 in __libc_start_main (/lib64/libc.so.6+0x24349) Direct leak of 178 byte(s) in 1 object(s) allocated from: #0 0x49a53d in malloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3 #1 0x9a6ff4 in do_xmalloc /home/ahunt/oss-fuzz/git/wrapper.c:41:8 #2 0x9a6fca in xmalloc /home/ahunt/oss-fuzz/git/wrapper.c:62:9 #3 0x8ce296 in copy_ref /home/ahunt/oss-fuzz/git/remote.c:885:8 #4 0x8d2ebd in guess_remote_head /home/ahunt/oss-fuzz/git/remote.c:2215:10 #5 0x51d0c5 in cmd_clone /home/ahunt/oss-fuzz/git/builtin/clone.c:1308:4 #6 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11 #7 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3 #8 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4 #9 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19 #10 0x69c45e in main /home/ahunt/oss-fuzz/git/common-main.c:52:11 #11 0x7f6a459d5349 in __libc_start_main (/lib64/libc.so.6+0x24349) Direct leak of 165 byte(s) in 1 object(s) allocated from: #0 0x49a6b2 in calloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:154:3 #1 0x9a72f2 in xcalloc /home/ahunt/oss-fuzz/git/wrapper.c:140:8 #2 0x8ce203 in alloc_ref_with_prefix /home/ahunt/oss-fuzz/git/remote.c:867:20 #3 0x8ce1a2 in alloc_ref /home/ahunt/oss-fuzz/git/remote.c:875:9 #4 0x72f63e in process_ref_v2 /home/ahunt/oss-fuzz/git/connect.c:426:8 #5 0x72f21a in get_remote_refs /home/ahunt/oss-fuzz/git/connect.c:525:8 #6 0x979ab7 in handshake /home/ahunt/oss-fuzz/git/transport.c:305:4 #7 0x97872d in get_refs_via_connect /home/ahunt/oss-fuzz/git/transport.c:339:9 #8 0x9774b5 in transport_get_remote_refs /home/ahunt/oss-fuzz/git/transport.c:1388:4 #9 0x51cf80 in cmd_clone /home/ahunt/oss-fuzz/git/builtin/clone.c:1271:9 #10 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11 #11 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3 #12 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4 #13 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19 #14 0x69c45e in main /home/ahunt/oss-fuzz/git/common-main.c:52:11 #15 0x7f6a459d5349 in __libc_start_main (/lib64/libc.so.6+0x24349) Direct leak of 105 byte(s) in 1 object(s) allocated from: #0 0x49a859 in realloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3 #1 0x9a71f6 in xrealloc /home/ahunt/oss-fuzz/git/wrapper.c:126:8 #2 0x93622d in strbuf_grow /home/ahunt/oss-fuzz/git/strbuf.c:98:2 #3 0x937a73 in strbuf_addch /home/ahunt/oss-fuzz/git/./strbuf.h:231:3 #4 0x939fcd in strbuf_add_absolute_path /home/ahunt/oss-fuzz/git/strbuf.c:911:4 #5 0x69d3ce in absolute_pathdup /home/ahunt/oss-fuzz/git/abspath.c:261:2 #6 0x51c688 in cmd_clone /home/ahunt/oss-fuzz/git/builtin/clone.c:1021:10 #7 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11 #8 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3 #9 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4 #10 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19 #11 0x69c45e in main /home/ahunt/oss-fuzz/git/common-main.c:52:11 #12 0x7f6a459d5349 in __libc_start_main (/lib64/libc.so.6+0x24349) Signed-off-by: Andrzej Hunt <ajrhunt@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-14 15:57:59 -07:00
Andrzej Hunt	e901de6816	reset: free instead of leaking unneeded ref dwim_ref() allocs a new string into ref. Instead of setting to NULL to discard it, we can FREE_AND_NULL. This leak appears to have been introduced in: `4cf76f6bbf` (builtin/reset: compute checkout metadata for reset, 2020-03-16) This leak was found when running t0001 with LSAN, see also LSAN output below: Direct leak of 5 byte(s) in 1 object(s) allocated from: #0 0x486514 in strdup /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_interceptors.cpp:452:3 #1 0x9a7108 in xstrdup /home/ahunt/oss-fuzz/git/wrapper.c:29:14 #2 0x8add6b in expand_ref /home/ahunt/oss-fuzz/git/refs.c:670:12 #3 0x8ad777 in repo_dwim_ref /home/ahunt/oss-fuzz/git/refs.c:644:22 #4 0x6394af in dwim_ref /home/ahunt/oss-fuzz/git/./refs.h:162:9 #5 0x637e5c in cmd_reset /home/ahunt/oss-fuzz/git/builtin/reset.c:426:4 #6 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11 #7 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3 #8 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4 #9 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19 #10 0x69c5ce in main /home/ahunt/oss-fuzz/git/common-main.c:52:11 #11 0x7f57ebb9d349 in __libc_start_main (/lib64/libc.so.6+0x24349) Signed-off-by: Andrzej Hunt <ajrhunt@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-14 15:57:59 -07:00
Andrzej Hunt	f63b88867a	symbolic-ref: don't leak shortened refname in check_symref() shorten_unambiguous_ref() returns an allocated string. We have to track it separately from the const refname. This leak has existed since: `9ab55daa55` (git symbolic-ref --delete $symref, 2012-10-21) This leak was found when running t0001 with LSAN, see also LSAN output below: Direct leak of 19 byte(s) in 1 object(s) allocated from: #0 0x486514 in strdup /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_interceptors.cpp:452:3 #1 0x9ab048 in xstrdup /home/ahunt/oss-fuzz/git/wrapper.c:29:14 #2 0x8b452f in refs_shorten_unambiguous_ref /home/ahunt/oss-fuzz/git/refs.c #3 0x8b47e8 in shorten_unambiguous_ref /home/ahunt/oss-fuzz/git/refs.c:1287:9 #4 0x679fce in check_symref /home/ahunt/oss-fuzz/git/builtin/symbolic-ref.c:28:14 #5 0x679ad8 in cmd_symbolic_ref /home/ahunt/oss-fuzz/git/builtin/symbolic-ref.c:70:9 #6 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11 #7 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3 #8 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4 #9 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19 #10 0x69cc6e in main /home/ahunt/oss-fuzz/git/common-main.c:52:11 #11 0x7f98388a4349 in __libc_start_main (/lib64/libc.so.6+0x24349) Signed-off-by: Andrzej Hunt <ajrhunt@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-14 15:57:59 -07:00
Junio C Hamano	5be1c70518	Merge tag 'l10n-2.31.0-rnd2' of git://github.com/git-l10n/git-po l10n for Git 2.31.0 round 2 * tag 'l10n-2.31.0-rnd2' of git://github.com/git-l10n/git-po: l10n: zh_CN: for git v2.31.0 l10n round 1 and 2 l10n: de.po: Update German translation for Git v2.31.0 l10n: pt_PT: add Portuguese translations part 1 l10n: vi.po(5104t): for git v2.31.0 l10n round 2 l10n: es: 2.31.0 round 2 l10n: Add translation team info l10n: start Indonesian translation l10n: zh_TW.po: v2.31.0 round 2 (15 untranslated) l10n: bg.po: Updated Bulgarian translation (5104t) l10n: fr: v2.31 rnd 2 l10n: tr: v2.31.0-rc1 l10n: sv.po: Update Swedish translation (5104t0f0u) l10n: git.pot: v2.31.0 round 2 (9 new, 8 removed) l10n: tr: v2.31.0-rc0 l10n: sv.po: Update Swedish translation (5103t0f0u) l10n: pl.po: Update translation l10n: fr: v2.31.0 rnd 1 l10n: git.pot: v2.31.0 round 1 (155 new, 89 removed) l10n: Update Catalan translation l10n: ru.po: update Russian translation	2021-03-14 15:50:36 -07:00
René Scharfe	8588aa8657	vcs-svn: remove header files as well `fc47391e24` (drop vcs-svn experiment, 2020-08-13) removed most vcs-svn files. Drop the remaining header files as well, as they are no longer used. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-14 15:48:23 -07:00
Jiang Xin	473eb54151	l10n: zh_CN: for git v2.31.0 l10n round 1 and 2 Translate 161 new messages (5104t0f0u) for git 2.31.0. Signed-off-by: Jiang Xin <worldhello.net@gmail.com>	2021-03-15 00:05:25 +08:00
Jiang Xin	4bc948a743	Merge branch 'master' of github.com:vnwildman/git * 'master' of github.com:vnwildman/git: l10n: vi.po(5104t): for git v2.31.0 l10n round 2	2021-03-15 00:04:47 +08:00
Jiang Xin	e196890735	Merge branch 'l10n/zh_TW/210301' of github.com:l10n-tw/git-po * 'l10n/zh_TW/210301' of github.com:l10n-tw/git-po: l10n: zh_TW.po: v2.31.0 round 2 (15 untranslated)	2021-03-14 22:35:44 +08:00
Jiang Xin	84bc81478e	Merge branch 'po-id' of github.com:bagasme/git-po * 'po-id' of github.com:bagasme/git-po: l10n: Add translation team info l10n: start Indonesian translation	2021-03-14 22:35:17 +08:00
Jiang Xin	bd5fba827b	Merge branch 'master' of github.com:Softcatala/git-po * 'master' of github.com:Softcatala/git-po: l10n: Update Catalan translation	2021-03-14 22:34:46 +08:00
Jiang Xin	2d897529b2	Merge branch 'russian-l10n' of github.com:DJm00n/git-po-ru * 'russian-l10n' of github.com:DJm00n/git-po-ru: l10n: ru.po: update Russian translation	2021-03-14 22:34:12 +08:00
Jiang Xin	799df2e406	Merge branch 'pt-PT' of github.com:git-l10n-pt-PT/git-po * 'pt-PT' of github.com:git-l10n-pt-PT/git-po: l10n: pt_PT: add Portuguese translations part 1	2021-03-14 22:33:26 +08:00
René Scharfe	ca56dadb4b	use CALLOC_ARRAY Add and apply a semantic patch for converting code that open-codes CALLOC_ARRAY to use it instead. It shortens the code and infers the element size automatically. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-13 16:00:09 -08:00
René Scharfe	f1121499e6	git-compat-util.h: drop trailing semicolon from macro definition Make CALLOC_ARRAY usable like a function by requiring callers to supply the trailing semicolon, which all of the current ones already do. With the extra semicolon e.g. the following code wouldn't compile because it disconnects the "else" from the "if": if (condition) CALLOC_ARRAY(ptr, n); else whatever(); Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-13 15:56:13 -08:00
Taylor Blau	4c8e3dca6e	Documentation/git-push.txt: correct configuration typo In the EXAMPLES section, git-push(1) says that 'git push origin' pushes the current branch to the value of the 'remote.origin.merge' configuration. This wording (which dates back to `b2ed944af7` (push: switch default from "matching" to "simple", 2013-01-04)) is incorrect. There is no such configuration as 'remote.<name>.merge'. This likely was originally intended to read "branch.<name>.merge" instead. Indeed, when 'push.default' is 'simple' (which is the default value, and is applicable in this scenario per "without additional configuration"), setup_push_upstream() dies if the branch's local name does not match 'branch.<name>.merge'. Correct this long-standing typo to resolve some recent confusion on the intended behavior of this example. Reported-by: Adam Sharafeddine <adam.shrfdn@gmail.com> Reported-by: Fabien Terrani <terranifabien@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-13 15:41:45 -08:00
Jonathan Nieder	53204061ac	doc: describe mergetool configuration in git-mergetool(1) In particular, this describes mergetool.hideResolved, which can help users discover this setting (either because it may be useful to them or in order to understand mergetool's behavior if they have forgotten setting it in the past). Tested by running make -C Documentation git-mergetool.1 man Documentation/git-mergetool.1 and reading through the page. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-13 15:34:32 -08:00
Jonathan Nieder	b2a51c1b03	mergetool: do not enable hideResolved by default When `98ea309b3f` (mergetool: add hideResolved configuration, 2021-02-09) introduced the mergetool.hideResolved setting to reduce the clutter in viewing non-conflicted sections of files in a mergetool, it enabled it by default, explaining: No adverse effects were noted in a small survey of popular mergetools[1] so this behavior defaults to `true`. In practice, alas, adverse effects do appear. A few issues: 1. No indication is shown in the UI that the base, local, and remote versions shown have been modified by additional resolution. This is inherent in the design: the idea of mergetool.hideResolved is to convince a mergetool that expects pristine local, base, and remote files to show partially resolved verisons of those files instead; there is no additional source of information accessible to the mergetool to see where the resolution has happened. (By contrast, a mergetool generating the partial resolution from conflict markers for itself would be able to hilight the resolved sections with a different color.) A user accustomed to seeing the files without partial resolution gets no indication that this behavior has changed when they upgrade Git. 2. If the computed merge did not line up the files correctly (for example due to repeated sections in the file), the partially resolved files can be misleading and do not have enough information to reconstruct what happened and compute the correct merge result. 3. Resolving a conflict can involve information beyond the textual conflict. For example, if the local and remote versions added overlapping functionality in different ways, seeing the full unresolved versions of each alongside the base gives information about each side's intent that makes it possible to come up with a resolution that combines those two intents. By contrast, when starting with partially resolved versions of those files, one can produce a subtly wrong resolution that includes redundant extra code added by one side that is not needed in the approach taken on the other. All that said, a user wanting to focus on textual conflicts with reduced clutter can still benefit from mergetool.hideResolved=true as a way to deemphasize sections of the code that resolve cleanly without requiring any changes to the invoked mergetool. The caveats described above are reduced when the user has explicitly turned this on, because then the user is aware of them. Flip the default to 'false'. Reported-by: Dana Dahlstrom <dahlstrom@google.com> Helped-by: Seth House <seth@eseth.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-13 15:30:29 -08:00
John Szakmeister	a4a4439fdf	http: drop the check for an empty proxy password before approving credential_approve() already checks for a non-empty password before saving, so there's no need to do the extra check here. Signed-off-by: John Szakmeister <john@szakmeister.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-11 22:17:10 -08:00
John Szakmeister	cd27f604e4	http: store credential when PKI auth is used We already looked for the PKI credentials in the credential store, but failed to approve it on success. Meaning, the PKI certificate password was never stored and git would request it on every connection to the remote. Let's complete the chain by storing the certificate password on success. Likewise, we also need to reject the credential when there is a failure. Curl appears to report client-related certificate issues are reported with the CURLE_SSL_CERTPROBLEM error. This includes not only a bad password, but potentially other client certificate related problems. Since we cannot get more information from curl, we'll go ahead and reject the credential upon receiving that error, just to be safe and avoid caching or saving a bad password. Signed-off-by: John Szakmeister <john@szakmeister.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-11 22:17:07 -08:00
René Scharfe	96099726dd	archive: expand only a single %(describe) per archive Every %(describe) placeholder in $Format:...$ strings in files with the attribute export-subst is expanded by calling git describe. This can potentially result in a lot of such calls per archive. That's OK for local repositories under control of the user of git archive, but could be a problem for hosted repositories. Expand only a single %(describe) placeholder per archive for now to avoid denial-of-service attacks. We can make this limit configurable later if needed, but let's start out simple. Reported-by: Jeff King <peff@peff.net> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-11 13:22:44 -08:00
Elijah Newren	e4fd06e7e2	diffcore-rename: avoid doing basename comparisons for irrelevant sources The basename comparison optimization implemented in find_basename_matches() is very beneficial since it allows a source to sometimes only be compared with one other file instead of N other files. When a match is found, both a source and destination can be removed from the matrix of inexact rename comparisons. In contrast, the irrelevant source optimization only allows us to remove a source from the matrix of inexact rename comparisons...but it has the advantage of allowing a source file to not even be loaded into memory at all and be compared to 0 other files. Generally, not even comparing is a bigger performance win, so when both optimizations could apply, prefer to use the irrelevant-source optimization. For the testcases mentioned in commit `557ac0350d` ("merge-ort: begin performance work; instrument with trace2_region_* calls", 2020-10-28), this change improves the performance as follows: Before After no-renames: 5.708 s ± 0.111 s 5.680 s ± 0.096 s mega-renames: 102.171 s ± 0.440 s 13.812 s ± 0.162 s just-one-mega: 3.471 s ± 0.015 s 506.0 ms ± 3.9 ms Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-10 22:18:05 -08:00
Elijah Newren	f89b4f2bee	merge-ort: skip rename detection entirely if possible diffcore_rename_extended() will do a bunch of setup, then check for exact renames, then abort before inexact rename detection if there are no more sources or destinations that need to be matched. It will sometimes be the case, however, that either * we start with neither any sources or destinations * we start with no relevant sources In the first of these two cases, the setup and exact rename detection will be very cheap since there are 0 files to operate on. In the second case, it is quite possible to have thousands of files with none of the source ones being relevant. Avoid calling diffcore_rename_extended() or even some of the setup before diffcore_rename_extended() when we can determine that rename detection is unnecessary. For the testcases mentioned in commit `557ac0350d` ("merge-ort: begin performance work; instrument with trace2_region_* calls", 2020-10-28), this change improves the performance as follows: Before After no-renames: 6.003 s ± 0.048 s 5.708 s ± 0.111 s mega-renames: 114.009 s ± 0.236 s 102.171 s ± 0.440 s just-one-mega: 3.489 s ± 0.017 s 3.471 s ± 0.015 s Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-10 22:18:05 -08:00
Elijah Newren	174791f0fb	merge-ort: use relevant_sources to filter possible rename sources The past several commits determined conditions when rename sources might be needed, and filled a relevant_sources strset with those paths. Pass these along to diffcore_rename_extended() to use to limit the sources that we need to detect renames for. For the testcases mentioned in commit `557ac0350d` ("merge-ort: begin performance work; instrument with trace2_region_* calls", 2020-10-28), this change improves the performance as follows: Before After no-renames: 12.596 s ± 0.061 s 6.003 s ± 0.048 s mega-renames: 130.465 s ± 0.259 s 114.009 s ± 0.236 s just-one-mega: 3.958 s ± 0.010 s 3.489 s ± 0.017 s Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-10 22:18:05 -08:00
Elijah Newren	2fd9eda462	merge-ort: precompute whether directory rename detection is needed The point of directory rename detection is that if one side of history renames a directory, and the other side adds new files under the old directory, then the merge can move those new files into the new directory. This leads to the following important observation: * If the other side does not add any new files under the old directory, we do not need to detect any renames for that directory. Similarly, directory rename detection had an important requirement: * If a directory still exists on one side of history, it has not been renamed on that side of history. (See section 4 of t6423 or Documentation/technical/directory-rename-detection.txt for more details). Using these two bits of information, we note that directory rename detection is only needed in cases where (1) directories exist in the merge base and on one side of history (i.e. dirmask == 3 or dirmask == 5), and (2) where there is some new file added to that directory on the side where it still exists (thus where the file has filemask == 2 or filemask == 4, respectively). This has to be done in two steps, because we have the dirmask when we are first considering the directory, and won't get the filemasks for the files within it until we recurse into that directory. So, we save dir_rename_mask = dirmask - 1 when we hit a directory that is missing on one side, and then later look for cases of filemask == dir_rename_mask One final note is that as soon as we hit a directory that needs directory rename detection, we will need to detect renames in all subdirectories of that directory as well due to the "majority rules" decision when files are renamed into different directory hierarchies. We arbitrarily use the special value of 0x07 to record when we've hit such a directory. The combination of all the above mean that we introduce a variable named dir_rename_mask (couldn't think of a better name) which has one of the following values as we traverse into a directory: * 0x00: directory rename detection not needed * 0x02 or 0x04: directory rename detection only needed if files added * 0x07: directory rename detection definitely needed We then pass this value through to add_pairs() so that it can mark location_relevant as true only when dir_rename_mask is 0x07. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-10 22:18:05 -08:00
Elijah Newren	a68e6cea59	merge-ort: introduce wrappers for alternate tree traversal Add traverse_trees_wrapper() and traverse_trees_wrapper_callback() functions. The former runs traverse_trees() with info->fn set to traverse_trees_wrapper_callback, in order to simply save all the entries without processing or recursing into any of them. This step allows extra computation to be done (e.g. checking some condition across all files) that can be used later. Then, after that is completed, it iterates over all the saved entries and calls the original info->fn callback with the saved data. Currently, this does nothing more than marginally slowing down the tree traversal since we do not take advantage of the opportunity to compute anything special in traverse_trees_wrapper_callback(), and thus the real callback will be called identically as it would have been without this extra wrapper. However, a subsequent commit will add some special computation of some values that the real callback will be able to use. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-10 22:18:05 -08:00
Elijah Newren	beb06145f8	merge-ort: add data structures for an alternate tree traversal In order to determine whether directory rename detection is needed, we as a pre-requisite need a way to traverse through all the files in a given tree before visiting any directories within that tree. traverse_trees() only iterates through the entries in the order they appear, so add some data structures that will store all the entries as we iterate through them in traverse_trees(), which will allow us to re-traverse them in our desired order. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-10 22:18:04 -08:00
Elijah Newren	32a56dfb99	merge-ort: precompute subset of sources for which we need rename detection rename detection works by trying to pair all file deletions (or "sources") with all file additions (or "destinations"), checking similarity, and then marking the sufficiently similar ones as renames. This can be expensive if there are many sources and destinations on a given side of history as it results in an N x M comparison matrix. However, there are many cases where we can compute in advance that detecting renames for some of the sources provides no useful information and thus that we can exclude those sources from the matrix. To see why, first note that the merge machinery uses detected renames in two ways: * directory rename detection: when one side of history renames a directory, and the other side of history adds new files to that directory, we want to be able to warn the user about the need to chose whether those new files stay in the old directory or move to the new one. * three-way content merging: in order to do three-way content merging of files, we need three different file versions. If one side of history renamed a file, then some of the content for the file is found under a different path than in the merge base or on the other side of history. Add a simple testcase showing the two kinds of reasons renames are relevant; it's a testcase that will only pass if we detect both kinds of needed renames. Other than the testcase added above, this commit concentrates just on the three-way content merging; it will punt and mark all sources as needed for directory rename detection, and leave it to future commits to narrow that down more. The point of three-way content merging is to reconcile changes made on both sides of history. What if the file wasn't modified on both sides? There are two possibilities: * If it wasn't modified on the renamed side: -> then we get to do exact rename detection, which is cheap. * If it wasn't modified on the unrenamed side: -> then detection of a rename for that source file is irrelevant That latter claim might be surprising at first, so let's walk through a case to show why rename detection for that source file is irrelevant. Let's use two filenames, old.c & new.c, with the following abbreviated object ids (and where the value '000000' is used to denote that the file is missing in that commit): old.c new.c MERGE_BASE: 01d01d 000000 MERGE_SIDE1: 01d01d 000000 MERGE_SIDE2: 000000 5e1ec7 If the rename isn't detected: then old.c looks like it was unmodified on one side and deleted on the other and should thus be removed. new.c looks like a new file we should keep as-is. If the rename is detected: then a three-way content merge is done. Since the version of the file in MERGE_BASE and MERGE_SIDE1 are identical, the three-way merge will produce exactly the version of the file whose abbreviated object id is 5e1ec7. It will record that file at the path new.c, while removing old.c from the directory. Note that these two results are identical -- a single file named 'new.c' with object id 5e1ec7. In other words, it doesn't matter if the rename is detected in the case where the file is unmodified on the unrenamed side. Use this information to compute whether we need rename detection for each source created in add_pair(). It's probably worth noting that there used to be a few other edge or corner cases besides three-way content merges and directory rename detection where lack of rename detection could have affected the result, but those cases actually highlighted where conflict resolution methods were not consistent with each other. Fixing those inconsistencies were thus critically important to enabling this optimization. That work involved the following: * bringing consistency to add/add, rename/add, and rename/rename conflict types, as done back in the topic merged at commit `ac193e0e0a` ("Merge branch 'en/merge-path-collision'", 2019-01-04), and further extended in commits `2a7c16c980` ("t6422, t6426: be more flexible for add/add conflicts involving renames", 2020-08-10) and `e8eb99d4a6` ("t642[23]: be more flexible for add/add conflicts involving pair renames", 2020-08-10) * making rename/delete more consistent with modify/delete as done in commits `1f3c9ba707` ("t6425: be more flexible with rename/delete conflict messages", 2020-08-10) and `727c75b23f` ("t6404, t6423: expect improved rename/delete handling in ort backend", 2020-10-26) Since the set of relevant_sources we compute has not yet been narrowed down for directory rename detection, we do not pass it to diffcore_rename_extended() yet. That will be done after subsequent commits narrow down the list of relevant_sources needed for directory rename detection reasons. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-10 22:18:04 -08:00
Elijah Newren	9799889f2e	diffcore-rename: enable filtering possible rename sources Add the ability to diffcore_rename_extended() to allow external callers to declare that they only need renames detected for a subset of source files, and use that information to skip detecting renames for them. There are two important pieces to this optimization that may not be obvious at first glance: * We do not require callers to just filter the filepairs out to remove the non-relevant sources, because exact rename detection is fast and when it finds a match it can remove both a source and a destination whereas the relevant_sources filter can only remove a source. * We need to filter out the source pairs in a preliminary pass instead of adding a strset_contains(relevant_sources, one->path) check within the nested matrix loop. The reason for that is if we have 30k renames, doing 30k * 30k = 900M strset_contains() calls becomes extraordinarily expensive and defeats the performance gains from this change; we only want to do 30k such calls instead. If callers pass NULL for relevant_sources, that is special cases to treat all sources as relevant. Since all callers currently pass NULL, this optimization does not yet have any effect. Subsequent commits will have merge-ort compute a set of relevant_sources to restrict which sources we detect renames for, and have merge-ort pass that set of relevant_sources to diffcore_rename_extended(). A note about filtering order: Some may be curious why we don't filter out irrelevant sources at the same time we filter out exact renames. While that technically could be done at this point, there are two reasons to defer it: First, was to reinforce a lesson that was too easy to forget. As I mentioned above, in the past I filtered irrelevant sources out before exact rename checking, and then discovered that exact renames' ability to remove both sources and destinations was an important consideration and thus doing the filtering after exact rename checking would speed things up. Then at some point I realized that basename matching could also remove both sources and destinations, and decided to put irrelevant source filtering after basename filtering. That slowed things down a lot. But, despite learning about this important ordering, in later restructuring I forgot and made the same mistake of putting the filtering after basename guided rename detection again. So, I have this series of patches structured to do the irrelevant filtering last to start to show how much extra it costs, and then add relevant filtering in to find_basename_matches() to show how much it speeds things up. Basically, it's a way to reinforce something that apparently was too easy to forget, and make sure the commit messages record this lesson. Second, the items in the "relevant_sources" in this patch series will include all sources that might be relevant. It has to be conservative and catch anything that might need a rename, but in the patch series after this one we'll find ways to weed out more of the might be relevant ones. Unfortunately, merge-ort does not have sufficient information to weed those ones out, and there isn't enough information at the time of filtering exact renames out to remove the extra ones either. It has to be deferred. So the deferral is in part to simplify some later additions. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-10 22:18:04 -08:00
brian m. carlson	75555676ad	builtin/init-db: handle bare clones when core.bare set to false In `552955ed7f` ("clone: use more conventional config/option layering", 2020-10-01), clone learned to read configuration options earlier in its execution, before creating the new repository. However, that led to a problem: if the core.bare setting is set to false in the global config, cloning a bare repository segfaults. This happens because the repository is falsely thought to be non-bare, but clone has set the work tree to NULL, which is then dereferenced. The code to initialize the repository already considers the fact that a user might want to override the --bare option for git init, but it doesn't take into account clone, which uses a different option. Let's just check that the work tree is not NULL, since that's how clone indicates that the repository is bare. This is also the case for git init, so we won't be regressing that case. Reported-by: Joseph Vusich <jvusich@amazon.com> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-10 15:06:48 -08:00
Jeff King	42efa1231a	filter-branch: drop $_x40 glob When checking whether a commit was rewritten to a single object id, we use a glob that insists on a 40-hex result. This works for sha1, but fails t7003 when run with GIT_TEST_DEFAULT_HASH=sha256. Since the previous commit simplified the case statement here, we only have two arms: an empty string or a single object id. We can just loosen our glob to match anything, and still distinguish those cases (we lose the ability to notice bogus input, but that's not a problem; we are the one who wrote the map in the first place, and anyway update-ref will complain loudly if the input isn't a valid hash). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-10 14:16:58 -08:00
Jeff King	98fe9e666f	filter-branch: drop multiple-ancestor warning When a ref maps to a commit that is neither rewritten nor kept by filter-branch (e.g., because it was eliminated by rev-list's pathspec selection), we rewrite it to its nearest ancestor. Since the initial commit in `6f6826c52b` (Add git-filter-branch, 2007-06-03), we have warned when there are multiple such ancestors in the map file. However, the warning code is impossible to trigger these days. Since `a0e46390d3` (filter-branch: fix ref rewriting with --subdirectory-filter, 2008-08-12), we find the ancestor using "rev-list -1", so it can only ever have a single value. This code is made doubly confusing by the fact that we append to the map file when mapping ancestors. However, this can never yield multiple values because: - we explicitly check whether the map already exists, and if so, do nothing (so our "append" will always be to a file that does not exist) - even if we were to try mapping twice, the process to do so is deterministic. I.e., we'd always end up with the same ancestor for a given sha1. So warning about it would be pointless; there is no ambiguity. So swap out the warning code for a BUG (which we'll simplify further in the next commit). And let's stop using the append operator to make the ancestor-mapping code less confusing. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-10 14:14:52 -08:00
Jeff King	6d875d19fd	t7003: test ref rewriting explicitly After it has rewritten all of the commits, filter-branch will then rewrite each of the input refs based on the resulting map of old/new commits. But we don't have any explicit test coverage of this code. Let's make sure we are covering each of those cases: - deleting a ref when all of its commits were pruned - rewriting a ref based on the mapping (this happens throughout the script, but let's make sure we generate the correct messages) - rewriting a ref whose tip was excluded, in which case we rewrite to the nearest ancestor. Note in this case that we still insist that no "warning" line is present (even though it looks like we'd trigger the "... was rewritten into multiple commits" one). See the next commit for more details. Note these all pass currently, but the latter two will fail when run with GIT_TEST_DEFAULT_HASH=sha256. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-10 14:14:19 -08:00
Junio C Hamano	13d7ab6b5d	Git 2.31-rc2	2021-03-08 16:09:43 -08:00
Junio C Hamano	56a57652ef	Sync with Git 2.30.2 for CVE-2021-21300 Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-08 16:09:07 -08:00
Junio C Hamano	6c46f864e5	Merge branch 'jt/transfer-fsck-across-packs-fix' The code to fsck objects received across multiple packs during a single git fetch session has been broken when the packfile URI feature was in use. A workaround has been added by disabling the codepath to avoid keeping a packfile that is too small. * jt/transfer-fsck-across-packs-fix: fetch-pack: do not mix --pack_header and packfile uri	2021-03-08 16:04:47 -08:00
Matthias Rüster	834845142d	l10n: de.po: Update German translation for Git v2.31.0 Reviewed-by: Ralf Thielow <ralf.thielow@gmail.com> Reviewed-by: Phillip Szelat <phillip.szelat@gmail.com> Signed-off-by: Matthias Rüster <matthias.ruester@gmail.com>	2021-03-08 19:49:33 +01:00
Andrzej Hunt	68b5c3aa48	Makefile: update 'make fuzz-all' docs to reflect modern clang Clang no longer produces a libFuzzer.a. Instead, you can include libFuzzer by using -fsanitize=fuzzer. Therefore we should use that in the example command for building fuzzers. We also add -fsanitize=fuzzer-no-link to the CFLAGS to ensure that all the required instrumentation is added when compiling git [1], and remove -fsanitize-coverage=trace-pc-guard as it is deprecated. I happen to have tested with LLVM 11 - however -fsanitize=fuzzer appears to work in a wide range of reasonably modern clangs. (On my system: what used to be libFuzzer.a now lives under the following path, which is tricky albeit not impossible for a novice such as myself to find: /usr/lib64/clang/11.0.0/lib/linux/libclang_rt.fuzzer-x86_64.a ) [1] https://releases.llvm.org/11.0.0/docs/LibFuzzer.html#fuzzer-usage Signed-off-by: Andrzej Hunt <ajrhunt@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-08 10:26:25 -08:00
Ramkumar Ramachandra	e8df3b6c6c	Add entry for Ramkumar Ramachandra Signed-off-by: Ramkumar Ramachandra <r@artagnon.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-08 09:56:34 -08:00
René Scharfe	241b5d3ebe	fix xcalloc() argument order Pass the number of elements first and ther size second, as expected by xcalloc(). Provide a semantic patch, which was actually used to generate the rest of this patch. The semantic patch would generate flip-flop diffs if both arguments are sizeofs. We don't have such a case, and it's hard to imagine the usefulness of such an allocation. If it ever occurs then we could deal with it by duplicating the rule in the semantic patch to make it cancel itself out, or we could change the code to use CALLOC_ARRAY. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-08 09:45:04 -08:00
Daniel Santos	408985d301	l10n: pt_PT: add Portuguese translations part 1 * Newlines corrected. * Add concept translation table. * Translated some. * Corrected some. * Corrected some 'Negation of Emptiness'. Signed-off-by: Daniel Santos <hello@brighterdan.com>	2021-03-08 15:21:51 +00:00
Tran Ngoc Quan	1369935987	l10n: vi.po(5104t): for git v2.31.0 l10n round 2 Signed-off-by: Tran Ngoc Quan <vnwildman@gmail.com>	2021-03-08 09:03:04 +07:00
Christopher Diaz Riveros	b0adcc311b	l10n: es: 2.31.0 round 2 Signed-off-by: Christopher Diaz Riveros <christopher.diaz.riv@gmail.com>	2021-03-07 18:31:14 -05:00
Bagas Sanjaya	c21ad4d941	l10n: Add translation team info Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>	2021-03-07 19:38:29 +07:00
Bagas Sanjaya	8c4abfb8be	l10n: start Indonesian translation * Initialize PO file * Translate init-db.c * Translate wt-status.c * Translate builtin/clone.c * Translate builtin/checkout.c * Translate builtin/fetch.c * Complete core translations: * builtin/remote.c * builtin/index-pack.c * push.c * reset.c * Sync with l10n upstream Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>	2021-03-07 19:38:07 +07:00
Jonathan Tan	2aec3bc4b6	fetch-pack: do not mix --pack_header and packfile uri When fetching (as opposed to cloning) from a repository with packfile URIs enabled, an error like this may occur: fatal: pack has bad object at offset 12: unknown object type 5 fatal: finish_http_pack_request gave result -1 fatal: fetch-pack: expected keep then TAB at start of http-fetch output This bug was introduced in `b664e9ffa1` ("fetch-pack: with packfile URIs, use index-pack arg", 2021-02-22), when the index-pack args used when processing the inline packfile of a fetch response and when processing packfile URIs were unified. This bug happens because fetch, by default, partially reads (and consumes) the header of the inline packfile to determine if it should store the downloaded objects as a packfile or loose objects, and thus passes --pack_header=<...> to index-pack to inform it that some bytes are missing. However, when it subsequently fetches the additional packfiles linked by URIs, it reuses the same index-pack arguments, thus wrongly passing --index-pack-arg=--pack_header=<...> when no bytes are missing. This does not happen when cloning because "git clone" always passes do_keep, which instructs the fetch mechanism to always retain the packfile, eliminating the need to read the header. There are a few ways to fix this, including filtering out pack_header arguments when downloading the additional packfiles, but I decided to stick to always using index-pack throughout when packfile URIs are present - thus, Git no longer needs to read the bytes, and no longer needs --pack_header here. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-05 15:04:09 -08:00
Denton Liu	0af760e261	stash show: learn stash.showIncludeUntracked The previous commit teaches `git stash show --include-untracked`. It may be desirable for a user to be able to always enable the --include-untracked behavior. Teach the stash.showIncludeUntracked config option which allows users to do this in a similar manner to stash.showPatch. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-05 14:31:27 -08:00
Denton Liu	d3c7bf73bd	stash show: teach --include-untracked and --only-untracked Stash entries can be made with untracked files via `git stash push --include-untracked`. However, because the untracked files are stored in the third parent of the stash entry and not the stash entry itself, running `git stash show` does not include the untracked files as part of the diff. With --include-untracked, untracked paths, which are recorded in the third-parent if it exists, are shown in addition to the paths that have modifications between the stash base and the working tree in the stash. It is possible to manually craft a malformed stash entry where duplicate untracked files in the stash entry will mask tracked files. We detect and error out in that case via a custom unpack_trees() callback: stash_worktree_untracked_merge(). Also, teach stash the --only-untracked option which only shows the untracked files of a stash entry. This is similar to `git show stash^3` but it is nice to provide a convenient abstraction for it so that users do not have to think about the underlying implementation. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-05 14:31:26 -08:00
Junio C Hamano	ccae01cab8	builtin/repack.c: reword comment around pack-objects flags The comment in this block is meant to indicate that passing '--all', '--reflog', and so on aren't necessary when repacking with the '--geometric' option. But, it has two problems: first, it is factually incorrect ('--all' is not incompatible with '--stdin-packs' as the comment suggests); second, it is quite focused on the geometric case for a block that is guarding against it. Reword this comment to address both issues. Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-05 11:33:52 -08:00
Taylor Blau	2a15964128	builtin/repack.c: be more conservative with unsigned overflows There are a number of places in the geometric repack code where we multiply the number of objects in a pack by another unsigned value. We trust that the number of objects in a pack is always representable by a uint32_t, but we don't necessarily trust that that number can be multiplied without overflow. Sprinkle some unsigned_add_overflows() and unsigned_mult_overflows() in split_pack_geometry() to check that we never overflow any unsigned types when adding or multiplying them. Arguably these checks are a little too conservative, and certainly they do not help the readability of this function. But they are serving a useful purpose, so I think they are worthwhile overall. Suggested-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-05 11:33:52 -08:00
Taylor Blau	13d746a303	builtin/repack.c: assign pack split later To determine the where to place the split when repacking with the '--geometric' option, split_pack_geometry() assigns the "split" variable and then decrements it in a loop. It would be equivalent (and more readable) to assign the split to the loop position after exiting the loop, so do that instead. Suggested-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-05 11:33:52 -08:00
Taylor Blau	dab3247734	t7703: test --geometric repack with loose objects We don't currently have a test that demonstrates the non-idempotent behavior of 'git repack --geometric' with loose objects, so add one here to make sure we don't regress in this area. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-05 11:33:52 -08:00
Taylor Blau	f25e33c156	builtin/repack.c: do not repack single packs with --geometric In `0fabafd0b9` (builtin/repack.c: add '--geometric' option, 2021-02-22), the 'git repack --geometric' code aborts early when there is zero or one pack. When there are no packs, this code does the right thing by placing the split at "0". But when there is exactly one pack, the split is placed at "1", which means that "git repack --geometric" (with any factor) repacks all of the objects in a single pack. This is wasteful, and the remaining code in split_pack_geometry() does the right thing (not repacking the objects in a single pack) even when only one pack is present. Loosen the guard to only stop when there aren't any packs, and let the rest of the code do the right thing. Add a test to ensure that this is the case. Noticed-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-05 11:33:52 -08:00
Yi-Jyun Pan	8278f87022	l10n: zh_TW.po: v2.31.0 round 2 (15 untranslated) Signed-off-by: Yi-Jyun Pan <pan93412@gmail.com>	2021-03-06 02:43:34 +08:00
Alexander Shopov	2f176de687	l10n: bg.po: Updated Bulgarian translation (5104t) Signed-off-by: Alexander Shopov <ash@kambanaria.org>	2021-03-05 12:12:34 +01:00
Jiang Xin	1ecef023a9	Merge branch 'fr_next' of github.com:jnavila/git * 'fr_next' of github.com:jnavila/git: l10n: fr: v2.31 rnd 2	2021-03-05 13:47:07 +08:00
Jiang Xin	5b888ad949	Merge branch 'master' of github.com:nafmo/git-l10n-sv * 'master' of github.com:nafmo/git-l10n-sv: l10n: sv.po: Update Swedish translation (5104t0f0u)	2021-03-05 13:46:25 +08:00
Junio C Hamano	be7935ed8b	Merged the open-eintr workaround for macOS Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-04 15:42:50 -08:00
Elijah Newren	58d581c344	Documentation/RelNotes: improve release note for rename detection work There were some early changes in the 2.31 cycle to optimize some setup in diffcore-rename.c[1], some later changes to measure performance[2], and finally some significant changes to improve rename detection performance. The final one was merged with the note Performance optimization work on the rename detection continues. That works for the commit log, but feels misleading as a release note since all the changes were within one cycle. Simplify this to just Performance improvements for rename detection. The former wording could be seen as hinting that more performance improvements will come in 2.32, which is true, but we can just cover those in the 2.32 release notes when the time comes. [1] `a5ac31b5b1` (Merge branch 'en/diffcore-rename', 2021-01-25) [2] `d3a035b055` (Merge branch 'en/merge-ort-perf', 2021-02-11) [3] `12bd17521c` (Merge branch 'en/diffcore-rename', 2021-03-01) Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-04 15:38:11 -08:00
Junio C Hamano	921846fa22	Merge branch 'jk/open-returns-eintr' Work around platforms whose open() is reported to return EINTR (it shouldn't, as we do our signals with SA_RESTART). * jk/open-returns-eintr: config.mak.uname: enable OPEN_RETURNS_EINTR for macOS Big Sur Makefile: add OPEN_RETURNS_EINTR knob	2021-03-04 15:34:45 -08:00
Jean-Noël Avila	068cb92300	l10n: fr: v2.31 rnd 2 Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>	2021-03-04 21:53:45 +01:00
Junio C Hamano	85c787f1e9	Merge https://github.com/prati0100/git-gui * https://github.com/prati0100/git-gui: Revert "git-gui: remove lines starting with the comment character"	2021-03-04 12:38:50 -08:00
Emir Sarı	f6a7e896b8	l10n: tr: v2.31.0-rc1 Signed-off-by: Emir Sarı <bitigchi@me.com>	2021-03-04 22:29:24 +03:00
Peter Krefting	929dc48e96	l10n: sv.po: Update Swedish translation (5104t0f0u) Signed-off-by: Peter Krefting <peter@softwolves.pp.se>	2021-03-04 19:10:43 +01:00
Jiang Xin	9b7e82b940	l10n: git.pot: v2.31.0 round 2 (9 new, 8 removed) Generate po/git.pot from v2.31.0-rc1 for git v2.31.0 l10n round 2. Signed-off-by: Jiang Xin <worldhello.net@gmail.com>	2021-03-04 22:41:21 +08:00
Jiang Xin	4dd8469336	Merge branch 'master' of github.com:git/git * 'master' of github.com:git/git: (63 commits) Git 2.31-rc1 Hopefully the last batch before -rc1 Revert "commit-graph: when incompatible with graphs, indicate why" read-cache: make the index write buffer size 128K dir: fix malloc of root untracked_cache_dir commit-graph.c: display correct number of chunks when writing doc/reftable: document how to handle windows fetch-pack: print and use dangling .gitmodules fetch-pack: with packfile URIs, use index-pack arg http-fetch: allow custom index-pack args http: allow custom index-pack args chunk-format: add technical docs chunk-format: restore duplicate chunk checks midx: use 64-bit multiplication for chunk sizes midx: use chunk-format read API commit-graph: use chunk-format read API chunk-format: create read chunk API midx: use chunk-format API in write_midx_internal() midx: drop chunk progress during write midx: return success/failure in chunk write methods ...	2021-03-04 22:40:13 +08:00
Pratyush Yadav	df4f9e28f6	Merge branch 'py/revert-commit-comments' This commit causes breakage on macOS, or in fact any platform using older versions of Tcl. Revert it. * py/revert-commit-comments: Revert "git-gui: remove lines starting with the comment character"	2021-03-04 13:59:45 +05:30
Pratyush Yadav	c0698df057	Revert "git-gui: remove lines starting with the comment character" This reverts commit `b9a43869c9`. This commit causes breakage on macOS (10.13). It causes errors on startup and completely breaks the commit functionality. There are two main problems. First, it uses `string cat` which is not supported on older Tcl versions. Second, it does a half close of the bidirectional pipe to git-stripspace which is also not supported on older Tcl versions. Reported-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>	2021-03-04 13:53:27 +05:30
Julien Richard	ea7e63921c	doc: .gitignore documentation typofix Signed-off-by: Julien Richard <julien.richard@ubisoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-03 17:16:48 -08:00
Shubham Verma	12604a8d0c	t9801: replace test -f with test_path_is_file Although `test -f` has the same functionality as test_path_is_file(), in the case where test_path_is_file() fails, we get much better debugging information. Replace `test -f` with test_path_is_file so that future developers will have a better experience debugging these test cases. Signed-off-by: Shubham Verma <shubhunic@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-03 17:11:31 -08:00
Torsten Bögershausen	93c3d297b5	git mv foo FOO ; git mv foo bar gave an assert The following sequence, on a case-insensitive file system, (strictly speeking with core.ignorecase=true) leads to an assertion failure and leaves .git/index.lock behind. git init echo foo >foo git add foo git mv foo FOO git mv foo bar This regression was introduced in Commit `9b906af657`, "git-mv: improve error message for conflicted file" The bugfix is to change the "file exist case-insensitive in the index" into the correct "file exist (case-sensitive) in the index". This avoids the "assert" later in the code and keeps setting up the "ce" pointer for ce_stage(ce) done in the next else if. This fixes https://github.com/git-for-windows/git/issues/2920 Reported-By: Dan Moseley <Dan.Moseley@microsoft.com> Signed-off-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-03 17:07:12 -08:00
Denton Liu	f451960708	git-cat-file.txt: remove references to "sha1" As part of the hash-transition, git can operate on more than just SHA-1 repositories. Replace "sha1"-specific documentation with hash-agnostic terminology. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-03 16:43:06 -08:00
Denton Liu	4f0ba2d533	git-cat-file.txt: monospace args, placeholders and filenames In modern documentation, args, placeholders and filenames are monospaced. Apply monospace formatting to these objects. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-03 16:43:03 -08:00
Junio C Hamano	f01623b2c9	Git 2.31-rc1 Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-02 22:41:13 -08:00
Emir Sarı	3ed77c4792	l10n: tr: v2.31.0-rc0 Signed-off-by: Emir Sarı <bitigchi@me.com>	2021-03-02 20:14:46 +08:00
Junio C Hamano	ec125d1bc1	Hopefully the last batch before -rc1 Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-01 14:02:58 -08:00
Junio C Hamano	9889cff6d6	Merge branch 'jh/untracked-cache-fix' An under-allocation for the untracked cache data has been corrected. * jh/untracked-cache-fix: dir: fix malloc of root untracked_cache_dir	2021-03-01 14:02:58 -08:00
Junio C Hamano	ada7c5fae5	Merge branch 'ns/raise-write-index-buffer-size' Raise the buffer size used when writing the index file out from (obviously too small) 8kB to (clearly sufficiently large) 128kB. * ns/raise-write-index-buffer-size: read-cache: make the index write buffer size 128K	2021-03-01 14:02:58 -08:00
Junio C Hamano	28714238c8	Merge branch 'hv/trailer-formatting' The logic to handle "trailer" related placeholders in the "--format=" mechanisms in the "log" family and "for-each-ref" family is getting unified. * hv/trailer-formatting: ref-filter: use pretty.c logic for trailers pretty.c: capture invalid trailer argument pretty.c: refactor trailer logic to `format_set_trailers_options()` t6300: use function to test trailer options	2021-03-01 14:02:58 -08:00
Junio C Hamano	18aabfaee5	Merge branch 'hn/reftable-tables-doc-update' Documentation update. * hn/reftable-tables-doc-update: doc/reftable: document how to handle windows	2021-03-01 14:02:57 -08:00
Junio C Hamano	fbad3505ee	Merge branch 'sv/t7001-modernize' Test script modernization. * sv/t7001-modernize: t7001: use `test` rather than `[` t7001: use here-docs instead of echo t7001: put each command on a separate line t7001: use '>' rather than 'touch' t7001: avoid using `cd` outside of subshells t7001: remove whitespace after redirect operators t7001: modernize subshell formatting t7001: remove unnecessary blank lines t7001: indent with TABs instead of spaces t7001: modernize test formatting	2021-03-01 14:02:57 -08:00
Junio C Hamano	6ee353d42f	Merge branch 'jt/transfer-fsck-across-packs' The approach to "fsck" the incoming objects in "index-pack" is attractive for performance reasons (we have them already in core, inflated and ready to be inspected), but fundamentally cannot be applied fully when we receive more than one pack stream, as a tree object in one pack may refer to a blob object in another pack as ".gitmodules", when we want to inspect blobs that are used as ".gitmodules" file, for example. Teach "index-pack" to emit objects that must be inspected later and check them in the calling "fetch-pack" process. * jt/transfer-fsck-across-packs: fetch-pack: print and use dangling .gitmodules fetch-pack: with packfile URIs, use index-pack arg http-fetch: allow custom index-pack args http: allow custom index-pack args	2021-03-01 14:02:57 -08:00
Junio C Hamano	660dd97a62	Merge branch 'ds/chunked-file-api' The common code to deal with "chunked file format" that is shared by the multi-pack-index and commit-graph files have been factored out, to help codepaths for both filetypes to become more robust. * ds/chunked-file-api: commit-graph.c: display correct number of chunks when writing chunk-format: add technical docs chunk-format: restore duplicate chunk checks midx: use 64-bit multiplication for chunk sizes midx: use chunk-format read API commit-graph: use chunk-format read API chunk-format: create read chunk API midx: use chunk-format API in write_midx_internal() midx: drop chunk progress during write midx: return success/failure in chunk write methods midx: add num_large_offsets to write_midx_context midx: add pack_perm to write_midx_context midx: add entries to write_midx_context midx: use context in write_midx_pack_names() midx: rename pack_info to write_midx_context commit-graph: use chunk-format write API chunk-format: create chunk format write API commit-graph: anonymize data in chunk_write_fn	2021-03-01 14:02:57 -08:00
Junio C Hamano	12bd17521c	Merge branch 'en/diffcore-rename' Performance optimization work on the rename detection continues. * en/diffcore-rename: merge-ort: call diffcore_rename() directly gitdiffcore doc: mention new preliminary step for rename detection diffcore-rename: guide inexact rename detection based on basenames diffcore-rename: complete find_basename_matches() diffcore-rename: compute basenames of source and dest candidates t4001: add a test comparing basename similarity and content similarity diffcore-rename: filter rename_src list when possible diffcore-rename: no point trying to find a match better than exact	2021-03-01 14:02:56 -08:00
Junio C Hamano	700696bcfc	Merge branch 'jh/fsmonitor-prework' Preliminary changes to fsmonitor integration. * jh/fsmonitor-prework: fsmonitor: refactor initialization of fsmonitor_last_update token fsmonitor: allow all entries for a folder to be invalidated fsmonitor: log FSMN token when reading and writing the index fsmonitor: log invocation of FSMonitor hook to trace2 read-cache: log the number of scanned files to trace2 read-cache: log the number of lstat calls to trace2 preload-index: log the number of lstat calls to trace2 p7519: add trace logging during perf test p7519: move watchman cleanup earlier in the test p7519: fix watchman watch-list test on Windows p7519: do not rely on "xargs -d" in test	2021-03-01 14:02:56 -08:00
René Scharfe	273c9901c2	pretty: document multiple %(describe) being inconsistent Each %(describe) placeholder is expanded using a separate git describe call. Their outputs depend on the tags present at the time, so there's no consistency guarantee. Document that fact. Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-01 09:50:27 -08:00
René Scharfe	09fe8ca92e	t4205: assert %(describe) test coverage Document that the test is covering both describable and undescribable commits. Suggested-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-01 09:42:17 -08:00
Junio C Hamano	90917373cd	Merge https://github.com/prati0100/git-gui * https://github.com/prati0100/git-gui: git-gui: remove lines starting with the comment character git-gui: fix typo in russian locale	2021-03-01 09:22:18 -08:00
Junio C Hamano	c0b27e3964	Merge branch 'js/commit-graph-warning' * js/commit-graph-warning: Revert "commit-graph: when incompatible with graphs, indicate why"	2021-03-01 09:21:24 -08:00
Junio C Hamano	cdc986a7c2	Revert "commit-graph: when incompatible with graphs, indicate why" This reverts commit `c85eec7fc3`, as it is a bit overzealous, we are in prerelease freeze, and we want to have enough time to get this right and cook in 'next'. cf. <8735xgkvuo.fsf@evledraar.gmail.com>	2021-03-01 09:19:37 -08:00
Jeff King	bbabaad298	config.mak.uname: enable OPEN_RETURNS_EINTR for macOS Big Sur We've had mixed reports on whether the latest release of macOS needs this Makefile knob set. In most reported cases, there's antivirus software running (which one might imagine could cause an open() call to be delayed). However, one of the (off-list) reports I've gotten indicated that it happened on an otherwise clean install of Big Sur. Since the symptom is so bad (checkout randomly fails to write several fails when the progress meter kicks in), and since the workaround is so lightweight (if we don't see EINTR, it's just an extra conditional check), let's just turn it on by default. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-01 09:07:45 -08:00
Patrick Steinhardt	23c781f173	githooks.txt: clarify documentation on reference-transaction hook The reference-transaction hook doesn't clearly document its scope and what values it receives as input. Document it to make it less surprising and clearly delimit its (current) scope. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-01 09:02:01 -08:00
Patrick Steinhardt	5f308a89d8	githooks.txt: replace mentions of SHA-1 specific properties The githooks(5) documentation states in several places that the hook will receive a SHA-1 or hashes of 40 characters length. Given that we're transitioning to a world where both SHA-1 and SHA-256 are supported, this is inaccurate. Fix the issue by replacing mentions of SHA-1 with "object name" and not explicitly mentioning the hash size. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-01 09:02:01 -08:00
Jiang Xin	75f5efcba2	Merge branch 'master' of github.com:nafmo/git-l10n-sv * 'master' of github.com:nafmo/git-l10n-sv: l10n: sv.po: Update Swedish translation (5103t0f0u)	2021-03-01 10:01:02 +08:00
Jiang Xin	0b71d789a8	Merge branch 'pl' of github.com:Arusekk/git-po * 'pl' of github.com:Arusekk/git-po: l10n: pl.po: Update translation	2021-03-01 09:59:07 +08:00
Peter Krefting	fe8885258b	l10n: sv.po: Update Swedish translation (5103t0f0u) Signed-off-by: Peter Krefting <peter@softwolves.pp.se>	2021-02-28 22:22:46 +01:00
Arusekk	fa42d191c6	l10n: pl.po: Update translation Signed-off-by: Arusekk <arek_koz@o2.pl>	2021-02-27 17:17:32 +01:00
Jean-Noël Avila	5ff5a30652	l10n: fr: v2.31.0 rnd 1 Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>	2021-02-27 15:47:45 +01:00
Elijah Newren	81afdf7a2e	diffcore-rename: compute dir_rename_guess from dir_rename_counts dir_rename_counts has a mapping of a mapping, in particular, it has old_dir => { new_dir => count } We want a simple mapping of old_dir => new_dir based on which new_dir had the highest count for a given old_dir. Compute this and store it in dir_rename_guess. This is the final piece of the puzzle needed to make our guesses at which directory files have been moved to when basenames aren't unique. For the testcases mentioned in commit `557ac0350d` ("merge-ort: begin performance work; instrument with trace2_region_* calls", 2020-10-28), this change improves the performance as follows: Before After no-renames: 12.775 s ± 0.062 s 12.596 s ± 0.061 s mega-renames: 188.754 s ± 0.284 s 130.465 s ± 0.259 s just-one-mega: 5.599 s ± 0.019 s 3.958 s ± 0.010 s Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-26 17:53:12 -08:00
Elijah Newren	333899e1e3	diffcore-rename: limit dir_rename_counts computation to relevant dirs We are using dir_rename_counts to count the number of other directories that files within a directory moved to. We only need this information for directories that disappeared, though, so we can return early from update_dir_rename_counts() for other paths. If dirs_removed is passed to diffcore_rename_extended(), then it provides the relevant bits of information for us to limit this counting to relevant dirs. If dirs_removed is not passed, we would need to compute some replacement in order to do this limiting. Introduce a new info->relevant_source_dirs variable for this purpose, even though at this stage we will only set it to dirs_removed for simplicity. Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-26 17:53:12 -08:00
Elijah Newren	1ad69eb0dc	diffcore-rename: compute dir_rename_counts in stages Compute dir_rename_counts based just on exact renames to start, as that can provide us useful information in find_basename_matches(). This is done by moving the code from compute_dir_rename_counts() into initialize_dir_rename_info(), resulting in it being computed earlier and based just on exact renames. Since that's an incomplete result, we augment the counts via calling update_dir_rename_counts() after each basename-guide and inexact rename detection match is found. Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-26 17:53:12 -08:00
Elijah Newren	b1473019e8	diffcore-rename: extend cleanup_dir_rename_info() When diffcore_rename_extended() is passed a NULL dir_rename_count, we will still want to create a temporary one for use by find_basename_matches(), but have it fully deallocated before diffcore_rename_extended() returns. However, when diffcore_rename_extended() is passed a dir_rename_count, we want to fill that strmap with appropriate values and return it. However, for our interim purposes we may also add entries corresponding to directories that cannot have been renamed due to still existing on both sides. Extend cleanup_dir_rename_info() to handle these two different cases, cleaning up the relevant bits of information for each case. Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-26 17:53:12 -08:00
Elijah Newren	b6e3d27434	diffcore-rename: move dir_rename_counts into dir_rename_info struct This continues the migration of the directory rename detection code into diffcore-rename, now taking the simple step of combining it with the dir_rename_info struct. Future commits will then make dir_rename_counts be computed in stages, and add computation of dir_rename_guess. Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-26 17:53:11 -08:00
Elijah Newren	cd52e0050f	diffcore-rename: add function for clearing dir_rename_count As we adjust the usage of dir_rename_count we want to have a function for clearing, or partially clearing it out. Add a partial_clear_dir_rename_count() function for this purpose. Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-26 17:53:11 -08:00
Elijah Newren	0c4fd732f0	Move computation of dir_rename_count from merge-ort to diffcore-rename Move the computation of dir_rename_count from merge-ort.c to diffcore-rename.c, making slight adjustments to the data structures based on the move. While the diffstat looks large, viewing this commit with --color-moved makes it clear that only about 20 lines changed. With this patch, the computation of dir_rename_count is still only done after inexact rename detection, but subsequent commits will add a preliminary computation of dir_rename_count after exact rename detection, followed by some updates after inexact rename detection. Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-26 17:53:11 -08:00
Elijah Newren	ae8cf74d3f	diffcore-rename: add a mapping of destination names to their indices Compute a mapping of full filename to the index within rename_dst where that filename is found, and store it in idx_map. idx_possible_rename() needs this to quickly finding an array entry in rename_dst given the pathname. While at it, add placeholder initializations for dir_rename_count and dir_rename_guess; these will be more fully populated in subsequent commits. Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-26 17:53:11 -08:00
Elijah Newren	bde8b9f34c	diffcore-rename: provide basic implementation of idx_possible_rename() Add a new struct dir_rename_info with various values we need inside our idx_possible_rename() function introduced in the previous commit. Add a basic implementation for this function showing how we plan to use the variables, but which will just return early with a value of -1 (not found) when those variables are not set up. Future commits will do the work necessary to set up those other variables so that idx_possible_rename() does not always return -1. Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-26 17:53:11 -08:00
Elijah Newren	37a2514364	diffcore-rename: use directory rename guided basename comparisons A previous commit noted that it is very common for people to move files across directories while keeping their filename the same. The last few commits took advantage of this and showed that we can accelerate rename detection significantly using basenames; since files with the same basename serve as likely rename candidates, we can check those first and remove them from the rename candidate pool if they are sufficiently similar. Unfortunately, the previous optimization was limited by the fact that the remaining basenames after exact rename detection are not always unique. Many repositories have hundreds of build files with the same name (e.g. Makefile, .gitignore, build.gradle, etc.), and may even have hundreds of source files with the same name. (For example, the linux kernel has 100 setup.c, 87 irq.c, and 112 core.c files. A repository at $DAYJOB has a lot of ObjectFactory.java and Plugin.java files). For these files with non-unique basenames, we are faced with the task of attempting to determine or guess which directory they may have been relocated to. Such a task is precisely the job of directory rename detection. However, there are two catches: (1) the directory rename detection code has traditionally been part of the merge machinery rather than diffcore-rename.c, and (2) directory rename detection currently runs after regular rename detection is complete. The 1st catch is just an implementation issue that can be overcome by some code shuffling. The 2nd requires us to add a further approximation: we only have access to exact renames at this point, so we need to do directory rename detection based on just exact renames. In some cases we won't have exact renames, in which case this extra optimization won't apply. We also choose to not apply the optimization unless we know that the underlying directory was removed, which will require extra data to be passed in to diffcore_rename_extended(). Also, even if we get a prediction about which directory a file may have relocated to, we will still need to check to see if there is a file in the predicted directory, and then compare the two files to see if they meet the higher min_basename_score threshold required for marking the two files as renames. This commit introduces an idx_possible_rename() function which will do this directory rename detection for us and give us the index within rename_dst of the resulting filename. For now, this function is hardcoded to return -1 (not found) and just hooks up how its results would be used once we have a more complete implementation in place. Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-26 17:53:11 -08:00
Taylor Blau	66f52fa26b	pack-revindex.c: don't close unopened file descriptors When opening a reverse index, load_revindex_from_disk() jumps to the 'cleanup' label in case something goes wrong: the reverse index had the wrong size, an unrecognized version, or similar. It also jumps to this label when the reverse index couldn't be opened in the first place, which will cause an error with the unguarded close() call in the label. Guard this call with "if (fd >= 0)" to make sure that we have a valid file descriptor to close before attempting to close it. Reported-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-26 14:42:27 -08:00
Jeff King	36e834abc1	t/perf: avoid copying worktree files from test repo When running the perf suite, we copy files from an existing $GIT_DIR to a scratch repository to give us a realistic setup on which to operate. Since the perf scripts themselves may modify the scratch repository, we want to make sure we've scrubbed any references back to the original. One existing example is that we avoid copying the file "commondir" at the top-level of the repository. In a worktree git-dir (e.g., .git/worktrees/foo), that file contains the path to the parent repository; copying it could mean ref updates in the scratch repository affect the original. But there are other files we should cover, too: - "gitdir" in a worktree git-dir contains the path to the actual .git file in the working tree. We _shouldn't_ end up looking at it at all, since the lack of a "commondir" file means Git won't consider this to be a worktree git-dir. But it's best to err on the safe side. - in a parent repository that contains worktrees, the "$GIT_DIR/worktrees" directory will contain the git dirs for the individual worktrees. Which will themselves contain commondir and gitdir files that may reference the original repository. We should likewise remove them. Note that this does mean that the perf suite's scratch repositories will never have any worktrees. That's OK; we don't have any perf tests that are influenced by their presence. If we add any, they'd probably want to create the worktrees themselves anyway. This patch adds both paths to the set of omissions in test_perf_copy_repo_contents(). Note that we won't get confused here by matching arbitrary names like refs/heads/commondir. This list is always matching top-level entries in $GIT_DIR (we rely on "cp -R" to do the actual recursion). Suggested-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-26 14:21:04 -08:00
Jeff King	85b87a5396	t/perf: handle worktrees as test repos The perf suite gets confused when test_perf_default_repo is pointed at a worktree (which includes when it is run from within a worktree at all, since the default is to use the current repository). Here's an example: $ git worktree add ~/foo Preparing worktree (new branch 'foo') HEAD is now at `328c109303` The eighth batch $ cd ~/foo $ make [...build output...] $ cd t/perf $ ./p0000-perf-lib-sanity.sh -v -i [...] perf 1 - test_perf_default_repo works: running: foo=$(git rev-parse HEAD) && test_export foo fatal: ambiguous argument 'HEAD': unknown revision or path not in the working tree. Use '--' to separate paths from revisions, like this: 'git <command> [<revision>...] -- [<file>...]' The problem is that we didn't copy all of the necessary files from the source repository (in this case we got HEAD, but we have no refs!). We discover the git-dir with "rev-parse --git-dir", but this points to the worktree's partial repository in .../.git/worktrees/foo. That partial repository has a "commondir" file which points to the main repository, where the actual refs are stored, but we don't copy it. This is the correct thing to do, though! If we did copy it, then our scratch test repo would be pointing back to the original main repo, and any ref updates we made in the tests would impact that original repo. Instead, we need to either: 1. Make a scratch copy of the original main repo (in addition to the worktree repo), and point the scratch worktree repo's commondir at it. This preserves the original relationship, but it's doubtful any script really cares (if they are testing worktree performance, they'd probably make their own worktrees). And it's trickier to get right. 2. Collapse the main and worktree repos into a single scratch repo. This can be done by copying everything from both, preferring any files from the worktree repo. This patch does the second one. With this applied, the example above results in p0000 running successfully. Reported-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-26 14:21:04 -08:00
Jeff King	2b08101204	Makefile: add OPEN_RETURNS_EINTR knob On some platforms, open() reportedly returns EINTR when opening regular files and we receive a signal (usually SIGALRM from our progress meter). This shouldn't happen, as open() should be a restartable syscall, and we specify SA_RESTART when setting up the alarm handler. So it may actually be a kernel or libc bug for this to happen. But it has been reported on at least one version of Linux (on a network filesystem): https://lore.kernel.org/git/c8061cce-71e4-17bd-a56a-a5fed93804da@neanderfunk.de/ as well as on macOS starting with Big Sur even on a regular filesystem. We can work around it by retrying open() calls that get EINTR, just as we do for read(), etc. Since we don't ever _want_ to interrupt an open() call, we can get away with just redefining open, rather than insisting all callsites use xopen(). We actually do have an xopen() wrapper already (and it even does this retry, though there's no indication of it being an observed problem back then; it seems simply to have been lifted from xread(), etc). But it is used hardly anywhere, and isn't suitable for general use because it will die() on error. In theory we could combine the two, but it's awkward to do so because of the variable-args interface of open(). This patch adds a Makefile knob for enabling the workaround. It's not enabled by default for any platforms in config.mak.uname yet, as we don't have enough data to decide how common this is (I have not been able to reproduce on either Linux or Big Sur myself). It may be worth enabling preemptively anyway, since the cost is pretty low (if we don't see an EINTR, it's just an extra conditional). However, note that we must not enable this on Windows. It doesn't do anything there, and the macro overrides the existing mingw_open() redirection. I've added a preemptive #undef here in the mingw header (which is processed first) to just quietly disable it (we could also make it an #error, but there is little point in being so aggressive). Reported-by: Aleksey Kliger <alklig@microsoft.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-26 14:15:51 -08:00
Matheus Tavares	6fab35f748	convert: fail gracefully upon missing clean cmd on required filter The gitattributes documentation mentions that either the clean cmd or the smudge cmd can be left unspecified in a filter definition. However, when the filter is marked as 'required', the absence of any one of these two should be treated as an error. Git already fails under these circumstances, but not always in a pleasant way: omitting a clean cmd in a required filter triggers an assertion error which leaves the user with a quite verbose message: git: convert.c:1459: convert_to_git_filter_fd: Assertion "ca.drv->clean \|\| ca.drv->process" failed. This assertion is not really necessary, as the apply_filter() call below it already performs the same check. And when this condition is not met, the function returns 0, making the caller die() with a much nicer message. (Also note that die()-ing here is the right behavior as `would_convert_to_git_filter_fd() == true` is a precondition to use convert_to_git_filter_fd(), and the former is only true when the filter is required.) So remove the assertion and add two regression tests to make sure that git fails nicely when either the smudge or clean command is missing on a required filter. Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-26 11:20:02 -08:00
Jiang Xin	712b0ed6ec	l10n: git.pot: v2.31.0 round 1 (155 new, 89 removed) Generate po/git.pot from v2.31.0-rc0 for git v2.31.0 l10n round 1. Signed-off-by: Jiang Xin <worldhello.net@gmail.com>	2021-02-26 22:09:42 +08:00
Junio C Hamano	225365fb51	Git 2.31-rc0 Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-25 16:43:33 -08:00
Junio C Hamano	140045821a	Merge branch 'jc/push-delete-nothing' "git push $there --delete ''" should have been diagnosed as an error, but instead turned into a matching push, which has been corrected. * jc/push-delete-nothing: push: do not turn --delete '' into a matching push	2021-02-25 16:43:33 -08:00
Junio C Hamano	cadae717d5	Merge branch 'sh/mergetools-vimdiff1' Mergetools update. * sh/mergetools-vimdiff1: mergetools/vimdiff: add vimdiff1 merge tool variant	2021-02-25 16:43:32 -08:00
Junio C Hamano	09e72204f8	Merge branch 'dl/doc-config-camelcase' A handful of multi-word configuration variable names in documentation that are spelled in all lowercase have been corrected to use the more canonical camelCase. * dl/doc-config-camelcase: index-format doc: camelCase core.excludesFile blame-options.txt: camelcase blame.blankBoundary i18n.txt: camel case and monospace "i18n.commitEncoding"	2021-02-25 16:43:32 -08:00
Junio C Hamano	1c8f5dfa42	Merge branch 'js/params-vs-args' Messages update. * js/params-vs-args: replace "parameters" by "arguments" in error messages	2021-02-25 16:43:32 -08:00
Junio C Hamano	d228b6b231	Merge branch 'ug/doc-commit-approxidate' Doc update. * ug/doc-commit-approxidate: doc: mention approxidates for git-commit --date	2021-02-25 16:43:32 -08:00
Junio C Hamano	d166e8c1d4	Merge branch 'es/maintenance-of-bare-repositories' The "git maintenance register" command had trouble registering bare repositories, which had been corrected. * es/maintenance-of-bare-repositories: maintenance: fix incorrect `maintenance.repo` path with bare repository	2021-02-25 16:43:32 -08:00
Junio C Hamano	f277234860	Merge branch 'mt/add-chmod-fixes' Various fixes on "git add --chmod". * mt/add-chmod-fixes: add: propagate --chmod errors to exit status add: mark --chmod error string for translation add --chmod: don't update index when --dry-run is used	2021-02-25 16:43:31 -08:00
Junio C Hamano	48923e8356	Merge branch 'ds/merge-base-independent' The code to implement "git merge-base --independent" was poorly done and was kept from the very beginning of the feature. * ds/merge-base-independent: commit-reach: stale commits may prune generation further commit-reach: use heuristic in remove_redundant() commit-reach: move compare_commits_by_gen commit-reach: use one walk in remove_redundant() commit-reach: reduce requirements for remove_redundant()	2021-02-25 16:43:31 -08:00
Junio C Hamano	682bbad64d	Merge branch 'ah/rebase-no-fork-point-config' "git rebase --[no-]fork-point" gained a configuration variable rebase.forkPoint so that users do not have to keep specifying a non-default setting. * ah/rebase-no-fork-point-config: rebase: add a config option for --no-fork-point	2021-02-25 16:43:31 -08:00
Junio C Hamano	628c13ccee	Merge branch 'mt/grep-sparse-checkout' "git grep" has been tweaked to be limited to the sparse checkout paths. * mt/grep-sparse-checkout: grep: honor sparse-checkout on working tree searches	2021-02-25 16:43:31 -08:00
Junio C Hamano	3c8e6dda21	Merge branch 'ah/commit-graph-leakplug' Plug a minor memory leak. * ah/commit-graph-leakplug: commit-graph: avoid leaking topo_levels slab in write_commit_graph()	2021-02-25 16:43:31 -08:00
Junio C Hamano	6eea44cee1	Merge branch 'zh/difftool-skip-to' "git difftool" learned "--skip-to=<path>" option to restart an interrupted session from an arbitrary path. * zh/difftool-skip-to: difftool.c: learn a new way start at specified file	2021-02-25 16:43:31 -08:00
Junio C Hamano	ccf6861b72	Merge branch 'cw/pack-config-doc' Doc update. * cw/pack-config-doc: doc: mention bigFileThreshold for packing	2021-02-25 16:43:31 -08:00
Junio C Hamano	dddb420535	Merge branch 'jc/maint-column-doc-typofix' Doc update. * jc/maint-column-doc-typofix: Documentation: typofix --column description	2021-02-25 16:43:30 -08:00
Junio C Hamano	2638e33c82	Merge branch 'ma/doc-markup-fix' Docfix. * ma/doc-markup-fix: gitmailmap.txt: fix rendering of e-mail addresses git.txt: fix monospace rendering rev-list-options.txt: fix rendering of bonus paragraph	2021-02-25 16:43:30 -08:00
Junio C Hamano	845d6030f8	Merge branch 'jc/diffcore-rotate' "git {diff,log} --{skip,rotate}-to=<path>" allows the user to discard diff output for early paths or move them to the end of the output. * jc/diffcore-rotate: diff: --{rotate,skip}-to=<path>	2021-02-25 16:43:30 -08:00
Junio C Hamano	3da165ca28	Merge branch 'mt/checkout-index-corner-cases' The error codepath around the "--temp/--prefix" feature of "git checkout-index" has been improved. * mt/checkout-index-corner-cases: checkout-index: omit entries with no tempname from --temp output write_entry(): fix misuses of `path` in error messages	2021-02-25 16:43:30 -08:00
Junio C Hamano	f47c3328ef	Merge branch 'js/doc-proto-v2-response-end' Docfix. * js/doc-proto-v2-response-end: doc: fix naming of response-end-pkt	2021-02-25 16:43:30 -08:00
Junio C Hamano	18decfd11d	Merge branch 'rs/blame-optim' Optimization in "git blame" * rs/blame-optim: blame: remove unnecessary use of get_commit_info()	2021-02-25 16:43:29 -08:00
Junio C Hamano	d590ae5560	Merge branch 'mz/doc-notes-are-not-anchors' Objects that lost references can be pruned away, even when they have notes attached to it (and these notes will become dangling, which in turn can be pruned with "git notes prune"). This has been clarified in the documentation. * mz/doc-notes-are-not-anchors: docs: clarify that refs/notes/ do not keep the attached objects alive	2021-02-25 16:43:29 -08:00
Junio C Hamano	608cc4f273	Merge branch 'ab/detox-gettext-tests' Removal of GIT_TEST_GETTEXT_POISON continues. * ab/detox-gettext-tests: tests: remove most uses of test_i18ncmp tests: remove last uses of C_LOCALE_OUTPUT tests: remove most uses of C_LOCALE_OUTPUT tests: remove last uses of GIT_TEST_GETTEXT_POISON=false	2021-02-25 16:43:29 -08:00
Junio C Hamano	6fe12b5215	Merge branch 'jk/rev-list-disk-usage' "git rev-list" command learned "--disk-usage" option. * jk/rev-list-disk-usage: docs/rev-list: add some examples of --disk-usage docs/rev-list: add an examples section rev-list: add --disk-usage option for calculating disk usage t: add --no-tag option to test_commit	2021-02-25 16:43:29 -08:00
Derrick Stolee	702110aac6	commit-graph: use config to specify generation type We have two established generation number versions: 1: topological levels 2: corrected commit dates The corrected commit dates are enabled by default, but they also write extra data in the GDAT and GDOV chunks. Services that host Git data might want to have more control over when this feature rolls out than just updating the Git binaries. Add a new "commitGraph.generationVersion" config option that specifies the intended generation number version. If this value is less than 2, then the GDAT chunk is never written _or read_ from an existing file. This can replace our use of the GIT_TEST_COMMIT_GRAPH_NO_GDAT environment variable in the test suite. Remove it. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-25 15:10:41 -08:00
Derrick Stolee	c7ef8fe608	commit-graph: create local repository pointer The write_commit_graph() method uses 'the_repository' in a few places. A new need for a repository pointer is coming in the following change, so group these instances into a local variable 'r' that could eventually become part of the method signature, if so desired. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-25 15:10:40 -08:00
Ævar Arnfjörð Bjarmason	0f1da600e6	remote: write camel-cased .pushRemote on rename When a remote is renamed don't change the canonical ".pushRemote" form to ".pushremote". Fixes and tests for a minor bug in `923d4a5ca4` (remote rename/remove: handle branch.<name>.pushRemote config values, 2020-01-27). See the preceding commit for why this does & doesn't matter. While we're at it let's also test that we handle the ".pushDefault" key correctly. The code to handle that was added in `b3fd6cbf29` (remote rename/remove: gently handle remote.pushDefault config, 2020-02-01) and does the right thing, but nothing tested that we wrote out the canonical camel-cased form. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-24 19:03:00 -08:00
Ævar Arnfjörð Bjarmason	bfa9148ff7	remote: add camel-cased .tagOpt key, like clone Change "git remote add" so that it adds a .tagOpt key, and not the lower-cased .tagopt on "git remote add --no-tags", just as "git clone --no-tags" would do. This doesn't matter for anything that reads the config. It's just prettier if we write config keys in their documented camelCase form to user-readable config files. When I added support for "clone -no-tags" in `0dab2468ee` (clone: add a --no-tags option to clone without tags, 2017-04-26) I made it use the .tagOpt form, but the older "git remote add" added in `111fb85865` (remote add: add a --[no-]tags option, 2010-04-20) has been using *.tagopt all this time. It's easy enough to add a test for this, so let's do that. We can't use "git config -l" there, because it'll normalize the keys to their lower-cased form. Let's add the test for "git clone" too for good measure, not just to the "git remote" codepath we're fixing. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-24 19:02:58 -08:00
Junio C Hamano	11875561bf	Merge branch 'ds/chunked-file-api' into tb/reverse-midx * ds/chunked-file-api: commit-graph.c: display correct number of chunks when writing chunk-format: add technical docs chunk-format: restore duplicate chunk checks midx: use 64-bit multiplication for chunk sizes midx: use chunk-format read API commit-graph: use chunk-format read API chunk-format: create read chunk API midx: use chunk-format API in write_midx_internal() midx: drop chunk progress during write midx: return success/failure in chunk write methods midx: add num_large_offsets to write_midx_context midx: add pack_perm to write_midx_context midx: add entries to write_midx_context midx: use context in write_midx_pack_names() midx: rename pack_info to write_midx_context commit-graph: use chunk-format write API chunk-format: create chunk format write API commit-graph: anonymize data in chunk_write_fn	2021-02-24 15:26:14 -08:00
Junio C Hamano	7dd0eaa39c	index-format doc: camelCase core.excludesFile Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-24 15:21:25 -08:00
Junio C Hamano	edaf10dd26	blame-options.txt: camelcase blame.blankBoundary All other references to blame.* configuration variables are camelCased already. Update this one to match. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-24 15:21:25 -08:00
Denton Liu	77645b5daa	i18n.txt: camel case and monospace "i18n.commitEncoding" In `95791be750` (doc: camelCase the i18n config variables to improve readability, 2017-07-17), the other i18n config variables were camel cased. However, this one instance was missed. Camel case and monospace "i18n.commitEncoding" so that it matches the surrounding text. Signed-off-by: Denton Liu <liu.denton@gmail.com> [jc: fixed 3 other mistakes that are exactly the same] Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-24 15:21:25 -08:00
Neeraj Singh	f279894d28	read-cache: make the index write buffer size 128K Writing an index 8K at a time invokes the OS filesystem and caching code very frequently, introducing noticeable overhead while writing large indexes. When experimenting with different write buffer sizes on Windows writing the Windows OS repo index (260MB), most of the benefit came by bumping the index write buffer size to 64K. I picked 128K to ensure that we're past the knee of the curve. With this change, the time under do_write_index for an index with 3M files goes from ~1.02s to ~0.72s. Signed-off-by: Neeraj Singh <neerajsi@ntdev.microsoft.com> Acked-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-24 13:40:30 -08:00
Matheus Tavares	9ebd7fe158	add: propagate --chmod errors to exit status If `add` encounters an error while applying the --chmod changes, it prints a message to stderr, but exits with a success code. This might have been an oversight, as the command does exit with a non-zero code in other situations where it cannot (or refuses to) update all of the requested paths (e.g. when some of the given paths are ignored). So make the exit behavior more consistent by also propagating --chmod errors to the exit status. Note: the test "all statuses changed in folder if . is given" uses paths added by previous test cases, some of which might be symbolic links. Because `git add --chmod` will now fail with such paths, this test would depend on whether all the previous tests were executed, or only some of them. Avoid that by running the test on a fresh repo with only regular files. Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-24 12:14:51 -08:00
Matheus Tavares	48960894f5	add: mark --chmod error string for translation This error message is intended for humans, so mark it for translation. Also use error() instead of fprintf(stderr, ...), to make the corresponding line a bit cleaner, and to display the "error:" prefix, which helps classifying the nature/severity of the message. Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-24 12:14:51 -08:00
Matheus Tavares	c937d70bfb	add --chmod: don't update index when --dry-run is used `git add --chmod` applies the mode changes even when `--dry-run` is used. Fix that and add some tests for this option combination. Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-24 12:14:51 -08:00
Jeff Hostetler	6347d649bc	dir: fix malloc of root untracked_cache_dir Use FLEX_ALLOC_STR() to allocate the `struct untracked_cache_dir` for the root directory. Get rid of unsafe code that might fail to initialize the `name` field (if FLEX_ARRAY is not 1). This will make it clear that we intend to have a structure with an empty string following it. A problem was observed on Windows where the length of the memset() was too short, so the first byte of the name field was not zeroed. This resulted in the name field having garbage from a previous use of that area of memory. The record for the root directory was then written to the untracked-cache extension in the index. This garbage would then be visible to future commands when they reloaded the untracked-cache extension. Since the directory record for the root directory had garbage in the `name` field, the `t/helper/test-tool dump-untracked-cache` tool printed this garbage as the path prefix (rather than '/') for each directory in the untracked cache as it recursed. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-24 12:09:10 -08:00
Alex Henrie	2803d800d2	rebase: add a config option for --no-fork-point Some users (myself included) would prefer to have this feature off by default because it can silently drop commits. Signed-off-by: Alex Henrie <alexhenrie24@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-24 11:49:10 -08:00
Taylor Blau	c4ff24bbb3	commit-graph.c: display correct number of chunks when writing When writing a commit-graph, a progress meter is shown which indicates the number of pieces of data to write (one per commit in each chunk). In `47410aa837` (commit-graph: use chunk-format write API, 2021-02-18), the number of chunks became tracked by the new chunk-format API. But a stray local variable was left behind from when write_commit_graph_file() used to keep track of the same. Since this was no longer updated after `47410aa837`, the progress meter appeared broken: $ git commit-graph write --reachable Expanding reachable commits in commit graph: 837569, done. Writing out commit graph in 3 passes: 166% (4187845/2512707), done. Drop the local variable and rely instead on the chunk-format API to tell us the correct number of chunks. Reported-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Acked-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-24 11:44:34 -08:00
Jordi Mas	a9926ecd54	l10n: Update Catalan translation Signed-off-by: Jordi Mas <jmas@softcatala.org>	2021-02-24 09:14:56 +01:00
Junio C Hamano	20e416409f	push: do not turn --delete '' into a matching push When we added a syntax sugar "git push remote --delete <ref>" to "git push" as a synonym to the canonical "git push remote :<ref>" syntax at `f517f1f2` (builtin-push: add --delete as syntactic sugar for :foo, 2009-12-30), we weren't careful enough to make sure that <ref> is not empty. Blindly rewriting "--delete <ref>" to ":<ref>" means that an empty string <ref> results in refspec ":", which is the syntax to ask for "matching" push that does not delete anything. Worse yet, if there were matching refs that can be fast-forwarded, they would have been published prematurely, even if the user feels that they are not ready yet to be pushed out, which would be a real disaster. Noticed-by: Tilman Vogel <tilman.vogel@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-23 15:19:34 -08:00
Jeff King	01168a9d89	doc: mention approxidates for git-commit --date We describe the more strict date formats accepted by GIT_COMMITTER_DATE, etc, but the --date option also allows the looser approxidate formats, as well. Unfortunately we don't have a good or complete reference for this format, but let's at least mention that it _is_ looser, and give a few examples. If we ever write separate, more complete date-format documentation, we should refer to it from here. Based-on-a-patch-by: Utku Gultopu <ugultopu@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-23 13:33:02 -08:00
Johannes Sixt	b865734760	replace "parameters" by "arguments" in error messages When an error message informs the user about an incorrect command invocation, it should refer to "arguments", not "parameters". Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-23 13:30:45 -08:00
Seth House	30bb8088af	mergetools/vimdiff: add vimdiff1 merge tool variant This adds yet another vimdiff/gvimdiff variant and presents conflicts as a two-way diff between 'LOCAL' and 'REMOTE'. 'MERGED' is not opened which deviates from the norm so usage text is echoed as a Vim message on startup that instructs the user with how to proceed and how to abort. Vimdiff is well-suited to two-way diffs so this is an option for a more simple, more streamlined conflict resolution. For example: it is difficult to communicate differences across more than two files using only syntax highlighting; default vimdiff commands to get and put changes between buffers do not need the user to manually specify a source or destination buffer when only using two buffers. Like other merge tools that directly compare 'LOCAL' with 'REMOTE', this tool will benefit when paired with the new `mergetool.hideResolved` setting. Signed-off-by: Seth House <seth@eseth.com> Tested-by: David Aguilar <davvid@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-23 11:37:13 -08:00
Han-Wen Nienhuys	00f68732e5	doc/reftable: document how to handle windows On Windows we can't delete or overwrite files opened by other processes. Here we sketch how to handle this situation. We propose to use a random element in the filename. It's possible to design an alternate solution based on counters, but that would assign semantics to the filenames that complicates implementation. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-23 10:01:21 -08:00
Ævar Arnfjörð Bjarmason	029bac01a8	Makefile: add {program,xdiff,test,git,fuzz}-objs & objects targets Add targets to compile the various .o files we declared in commonly used _OBJS variables. This is useful for debugging purposes, to e.g. get to the point where we can compile a git.o. See [1] for a use-case for this target. https://lore.kernel.org/git/YBCGtd9if0qtuQxx@coredump.intra.peff.net/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-23 09:57:59 -08:00
Ævar Arnfjörð Bjarmason	abc3c87f3d	Makefile: split OBJECTS into OBJECTS and GIT_OBJS Add a new GIT_OBJS variable, with the objects sufficient to get to a git.o or common-main.o. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-23 09:57:58 -08:00
Ævar Arnfjörð Bjarmason	d6da8b328e	Makefile: sort OBJECTS assignment for subsequent change Change the order of the OBJECTS assignment, this makes a follow-up change where we split it up into two variables smaller. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-23 09:57:58 -08:00
Ævar Arnfjörð Bjarmason	752b3ef972	Makefile: split up long OBJECTS line Split up the long OBJECTS line into multiple lines using the "+=" assignment we commonly use elsewhere in the Makefile when these lines get unwieldy. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-23 09:57:58 -08:00
Ævar Arnfjörð Bjarmason	bed3419925	Makefile: guard against TEST_OBJS in the environment Add TEST_OBJS to the list of other *_OBJS variables we reset. We had already established this pattern when TEST_OBJS was introduced in `daa99a9172` (Makefile: make sure test helpers are rebuilt when headers change, 2010-01-26), but it wasn't added to the list in that commit along with the rest. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-23 09:57:58 -08:00
Eric Sunshine	26c7974376	maintenance: fix incorrect `maintenance.repo` path with bare repository The periodic maintenance tasks configured by `git maintenance start` invoke `git for-each-repo` to run `git maintenance run` on each path specified by the multi-value global configuration variable `maintenance.repo`. Because `git for-each-repo` will likely be run outside of the repositories which require periodic maintenance, it is mandatory that the repository paths specified by `maintenance.repo` are absolute. Unfortunately, however, `git maintenance register` does nothing to ensure that the paths it assigns to `maintenance.repo` are indeed absolute, and may in fact -- especially in the case of a bare repository -- assign a relative path to `maintenance.repo` instead. Fix this problem by converting all paths to absolute before assigning them to `maintenance.repo`. While at it, also fix `git maintenance unregister` to convert paths to absolute, as well, in order to ensure that it can correctly remove from `maintenance.repo` a path assigned via `git maintenance register`. Reported-by: Clement Moyroud <clement.moyroud@gmail.com> Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-23 00:22:45 -08:00
Taylor Blau	0fabafd0b9	builtin/repack.c: add '--geometric' option Often it is useful to both: - have relatively few packfiles in a repository, and - avoid having so few packfiles in a repository that we repack its entire contents regularly This patch implements a '--geometric=<n>' option in 'git repack'. This allows the caller to specify that they would like each pack to be at least a factor times as large as the previous largest pack (by object count). Concretely, say that a repository has 'n' packfiles, labeled P1, P2, ..., up to Pn. Each packfile has an object count equal to 'objects(Pn)'. With a geometric factor of 'r', it should be that: objects(Pi) > r*objects(P(i-1)) for all i in [1, n], where the packs are sorted by objects(P1) <= objects(P2) <= ... <= objects(Pn). Since finding a true optimal repacking is NP-hard, we approximate it along two directions: 1. We assume that there is a cutoff of packs _before starting the repack_ where everything to the right of that cut-off already forms a geometric progression (or no cutoff exists and everything must be repacked). 2. We assume that everything smaller than the cutoff count must be repacked. This forms our base assumption, but it can also cause even the "heavy" packs to get repacked, for e.g., if we have 6 packs containing the following number of objects: 1, 1, 1, 2, 4, 32 then we would place the cutoff between '1, 1' and '1, 2, 4, 32', rolling up the first two packs into a pack with 2 objects. That breaks our progression and leaves us: 2, 1, 2, 4, 32 ^ (where the '^' indicates the position of our split). To restore a progression, we move the split forward (towards larger packs) joining each pack into our new pack until a geometric progression is restored. Here, that looks like: 2, 1, 2, 4, 32 ~> 3, 2, 4, 32 ~> 5, 4, 32 ~> ... ~> 9, 32 ^ ^ ^ ^ This has the advantage of not repacking the heavy-side of packs too often while also only creating one new pack at a time. Another wrinkle is that we assume that loose, indexed, and reflog'd objects are insignificant, and lump them into any new pack that we create. This can lead to non-idempotent results. Suggested-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-22 23:30:52 -08:00
Jeff King	20b031fede	packfile: add kept-pack cache for find_kept_pack_entry() In a recent patch we added a function 'find_kept_pack_entry()' to look for an object only among kept packs. While this function avoids doing any lookup work in non-kept packs, it is still linear in the number of packs, since we have to traverse the linked list of packs once per object. Let's cache a reduced version of that list to save us time. Note that this cache will last the lifetime of the program. We could invalidate it on reprepare_packed_git(), but there's not much point in being rigorous here: - we might already fail to notice new .keep packs showing up after the program starts. We only reprepare_packed_git() when we fail to find an object. But adding a new pack won't cause that to happen. Somebody repacking could add a new pack and delete an old one, but most of the time we'd have a descriptor or mmap open to the old pack anyway, so we might not even notice. - in pack-objects we already cache the .keep state at startup, since `56dfeb6263` (pack-objects: compute local/ignore_pack_keep early, 2016-07-29). So this is just extending that concept further. - we don't have to worry about any packed_git being removed; we always keep the old structs around, even after reprepare_packed_git() We do defensively invalidate the cache in case the set of kept packs being asked for changes (e.g., only in-core kept packs were cached, but suddenly the caller also wants on-disk kept packs, too). In theory we could build all three caches and switch between them, but it's not necessary, since this patch (and series) never changes the set of kept packs that it wants to inspect from the cache. So that "optimization" is more about being defensive in the face of future changes than it is about asking for multiple kinds of kept packs in this patch. Here are p5303 results (as always, measured against the kernel): Test HEAD^ HEAD ----------------------------------------------------------------------------------------------- 5303.5: repack (1) 57.34(54.66+10.88) 56.98(54.36+10.98) -0.6% 5303.6: repack with kept (1) 57.38(54.83+10.49) 57.17(54.97+10.26) -0.4% 5303.11: repack (50) 71.70(88.99+4.74) 71.62(88.48+5.08) -0.1% 5303.12: repack with kept (50) 72.58(89.61+4.78) 71.56(88.80+4.59) -1.4% 5303.17: repack (1000) 217.19(491.72+14.25) 217.31(490.82+14.53) +0.1% 5303.18: repack with kept (1000) 246.12(520.07+14.93) 217.08(490.37+15.10) -11.8% and the --stdin-packs case, which scales a little bit better (although not by that much even at 1,000 packs): 5303.7: repack with --stdin-packs (1) 0.00(0.00+0.00) 0.00(0.00+0.00) = 5303.13: repack with --stdin-packs (50) 3.43(11.75+0.24) 3.43(11.69+0.30) +0.0% 5303.19: repack with --stdin-packs (1000) 130.50(307.15+7.66) 125.13(301.36+8.04) -4.1% Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-22 23:30:52 -08:00
Jeff King	6325da14af	builtin/pack-objects.c: rewrite honor-pack-keep logic Now that we have find_kept_pack_entry(), we don't have to manually keep hunting through every pack to find a possible "kept" duplicate of the object. This should be faster, assuming only a portion of your total packs are actually kept. Note that we have to re-order the logic a bit here; we can deal with the disqualifying situations first (e.g., finding the object in a non-local pack with --local), then "kept" situation(s), and then just fall back to other "--local" conditions. Here are the results from p5303 (measurements again taken on the kernel): Test HEAD^ HEAD ----------------------------------------------------------------------------------------------- 5303.5: repack (1) 57.26(54.59+10.84) 57.34(54.66+10.88) +0.1% 5303.6: repack with kept (1) 57.33(54.80+10.51) 57.38(54.83+10.49) +0.1% 5303.11: repack (50) 71.54(88.57+4.84) 71.70(88.99+4.74) +0.2% 5303.12: repack with kept (50) 85.12(102.05+4.94) 72.58(89.61+4.78) -14.7% 5303.17: repack (1000) 216.87(490.79+14.57) 217.19(491.72+14.25) +0.1% 5303.18: repack with kept (1000) 665.63(938.87+15.76) 246.12(520.07+14.93) -63.0% and the --stdin-packs timings: 5303.7: repack with --stdin-packs (1) 0.01(0.01+0.00) 0.00(0.00+0.00) -100.0% 5303.13: repack with --stdin-packs (50) 3.53(12.07+0.24) 3.43(11.75+0.24) -2.8% 5303.19: repack with --stdin-packs (1000) 195.83(371.82+8.10) 130.50(307.15+7.66) -33.4% So our repack with an empty .keep pack is roughly as fast as one without a .keep pack up to 50 packs. But the --stdin-packs case scales a little better, too. Notably, it is faster than a repack of the same size and a kept pack. It looks at fewer objects, of course, but the penalty for looking at many packs isn't as costly. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-22 23:30:52 -08:00
Jeff King	fbf20aeeef	p5303: measure time to repack with keep Add two new tests to measure repack performance. Both tests split the repository into synthetic "pushes", and then leave the remaining objects in a big base pack. The first new test marks an empty pack as "kept" and then passes --honor-pack-keep to avoid including objects in it. That doesn't change the resulting pack, but it does let us compare to the normal repack case to see how much overhead we add to check whether objects are kept or not. The other test is of --stdin-packs, which gives us a sense of how that number scales based on the number of packs we provide as input. In each of those tests, the empty pack isn't considered, but the residual pack (objects that were left over and not included in one of the synthetic push packs) is marked as kept. (Note that in the single-pack case of the --stdin-packs test, there is nothing do since there are no non-excluded packs). Here are some timings on a recent clone of the kernel: 5303.5: repack (1) 57.26(54.59+10.84) 5303.6: repack with kept (1) 57.33(54.80+10.51) in the 50-pack case, things start to slow down: 5303.11: repack (50) 71.54(88.57+4.84) 5303.12: repack with kept (50) 85.12(102.05+4.94) and by the time we hit 1,000 packs, things are substantially worse, even though the resulting pack produced is the same: 5303.17: repack (1000) 216.87(490.79+14.57) 5303.18: repack with kept (1000) 665.63(938.87+15.76) That's because the code paths around handling .keep files are known to scale badly; they look in every single pack file to find each object. Our solution to that was to notice that most repos don't have keep files, and to make that case a fast path. But as soon as you add a single .keep, that part of pack-objects slows down again (even if we have fewer objects total to look at). Likewise, the scaling is pretty extreme on --stdin-packs (but each subsequent test is also being asked to do more work): 5303.7: repack with --stdin-packs (1) 0.01(0.01+0.00) 5303.13: repack with --stdin-packs (50) 3.53(12.07+0.24) 5303.19: repack with --stdin-packs (1000) 195.83(371.82+8.10) Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-22 23:30:52 -08:00
Jeff King	60bb5f2f5d	p5303: add missing &&-chains These are in a helper function, so the usual chain-lint doesn't notice them. This function is still not perfect, as it has some git invocations on the left-hand-side of the pipe, but it's primary purpose is timing, not finding bugs or correctness issues. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-22 23:30:52 -08:00
Taylor Blau	339bce27f4	builtin/pack-objects.c: add '--stdin-packs' option In an upcoming commit, 'git repack' will want to create a pack comprised of all of the objects in some packs (the included packs) excluding any objects in some other packs (the excluded packs). This caller could iterate those packs themselves and feed the objects it finds to 'git pack-objects' directly over stdin, but this approach has a few downsides: - It requires every caller that wants to drive 'git pack-objects' in this way to implement pack iteration themselves. This forces the caller to think about details like what order objects are fed to pack-objects, which callers would likely rather not do. - If the set of objects in included packs is large, it requires sending a lot of data over a pipe, which is inefficient. - The caller is forced to keep track of the excluded objects, too, and make sure that it doesn't send any objects that appear in both included and excluded packs. But the biggest downside is the lack of a reachability traversal. Because the caller passes in a list of objects directly, those objects don't get a namehash assigned to them, which can have a negative impact on the delta selection process, causing 'git pack-objects' to fail to find good deltas even when they exist. The caller could formulate a reachability traversal themselves, but the only way to drive 'git pack-objects' in this way is to do a full traversal, and then remove objects in the excluded packs after the traversal is complete. This can be detrimental to callers who care about performance, especially in repositories with many objects. Introduce 'git pack-objects --stdin-packs' which remedies these four concerns. 'git pack-objects --stdin-packs' expects a list of pack names on stdin, where 'pack-xyz.pack' denotes that pack as included, and '^pack-xyz.pack' denotes it as excluded. The resulting pack includes all objects that are present in at least one included pack, and aren't present in any excluded pack. To address the delta selection problem, 'git pack-objects --stdin-packs' works as follows. First, it assembles a list of objects that it is going to pack, as above. Then, a reachability traversal is started, whose tips are any commits mentioned in included packs. Upon visiting an object, we find its corresponding object_entry in the to_pack list, and set its namehash parameter appropriately. To avoid the traversal visiting more objects than it needs to, the traversal is halted upon encountering an object which can be found in an excluded pack (by marking the excluded packs as kept in-core, and passing --no-kept-objects=in-core to the revision machinery). This can cause the traversal to halt early, for example if an object in an included pack is an ancestor of ones in excluded packs. But stopping early is OK, since filling in the namehash fields of objects in the to_pack list is only additive (i.e., having it helps the delta selection process, but leaving it blank doesn't impact the correctness of the resulting pack). Even still, it is unlikely that this hurts us much in practice, since the 'git repack --geometric' caller (which is introduced in a later commit) marks small packs as included, and large ones as excluded. During ordinary use, the small packs usually represent pushes after a large repack, and so are unlikely to be ancestors of objects that already exist in the repository. (I found it convenient while developing this patch to have 'git pack-objects' report the number of objects which were visited and got their namehash fields filled in during traversal. This is also included in the below patch via trace2 data lines). Suggested-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-22 23:30:52 -08:00
Taylor Blau	c9fff00016	revision: learn '--no-kept-objects' A future caller will want to be able to perform a reachability traversal which terminates when visiting an object found in a kept pack. The closest existing option is '--honor-pack-keep', but this isn't quite what we want. Instead of halting the traversal midway through, a full traversal is always performed, and the results are only trimmed afterwords. Besides needing to introduce a new flag (since culling results post-facto can be different than halting the traversal as it's happening), there is an additional wrinkle handling the distinction in-core and on-disk kept packs. That is: what kinds of kept pack should stop the traversal? Introduce '--no-kept-objects[=<on-disk\|in-core>]' to specify which kinds of kept packs, if any, should stop a traversal. This can be useful for callers that want to perform a reachability analysis, but want to leave certain packs alone (for e.g., when doing a geometric repack that has some "large" packs which are kept in-core that it wants to leave alone). Note that this option is not guaranteed to produce exactly the set of objects that aren't in kept packs, since it's possible the traversal order may end up in a situation where a non-kept ancestor was "cut off" by a kept object (at which point we would stop traversing). But, we don't care about absolute correctness here, since this will eventually be used as a purely additive guide in an upcoming new repack mode. Explicitly avoid documenting this new flag, since it is only used internally. In theory we could avoid even adding it rev-list, but being able to spell this option out on the command-line makes some special cases easier to test without promising to keep it behaving consistently forever. Those tricky cases are exercised in t6114. Signed-off-by: Taylor Blau <me@ttaylorr.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-22 23:30:52 -08:00
Taylor Blau	f62312e028	packfile: introduce 'find_kept_pack_entry()' Future callers will want a function to fill a 'struct pack_entry' for a given object id but _only_ from its position in any kept pack(s). In particular, an new 'git repack' mode which ensures the resulting packs form a geometric progress by object count will mark packs that it does not want to repack as "kept in-core", and it will want to halt a reachability traversal as soon as it visits an object in any of the kept packs. But, it does not want to halt the traversal at non-kept, or .keep packs. The obvious alternative is 'find_pack_entry()', but this doesn't quite suffice since it only returns the first pack it finds, which may or may not be kept (and the mru cache makes it unpredictable which one you'll get if there are options). Short of that, you could walk over all packs looking for the object in each one, but it scales with the number of packs, which may be prohibitive. Introduce 'find_kept_pack_entry()', a function which is like 'find_pack_entry()', but only fills in objects in the kept packs. Handle packs which have .keep files, as well as in-core kept packs separately, since certain callers will want to distinguish one from the other. (Though on-disk and in-core kept packs share the adjective "kept", it is best to think of the two sets as independent.) There is a gotcha when looking up objects that are duplicated in kept and non-kept packs, particularly when the MIDX stores the non-kept version and the caller asked for kept objects only. This could be resolved by teaching the MIDX to resolve duplicates by always favoring the kept pack (if one exists), but this breaks an assumption in existing MIDXs, and so it would require a format change. The benefit to changing the MIDX in this way is marginal, so we instead have a more thorough check here which is explained with a comment. Callers will be added in subsequent patches. Co-authored-by: Jeff King <peff@peff.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-22 23:30:52 -08:00
Junio C Hamano	966e671106	The tenth batch Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-22 16:12:43 -08:00
Junio C Hamano	d68fccef86	Merge branch 'ab/test-lib' Test framework clean-up. * ab/test-lib: test-lib-functions: assert correct parameter count test-lib-functions: remove bug-inducing "diagnostics" helper param test libs: rename "diff-lib" to "lib-diff" t/.gitattributes: sort lines test-lib-functions: move function to lib-bitmap.sh test libs: rename gitweb-lib.sh to lib-gitweb.sh test libs: rename bundle helper to "lib-bundle.sh" test-lib-functions: remove generate_zero_bytes() wrapper test-lib-functions: move test_set_index_version() to its user test lib: change "error" to "BUG" as appropriate test-lib: remove check_var_migration	2021-02-22 16:12:43 -08:00
Junio C Hamano	45df6c4d75	Merge branch 'ab/diff-deferred-free' A small memleak in "diff -I<regexp>" has been corrected. * ab/diff-deferred-free: diff: plug memory leak from regcomp() on {log,diff} -I diff: add an API for deferred freeing	2021-02-22 16:12:43 -08:00
Junio C Hamano	dcb11fc622	Merge branch 'ab/pager-exit-log' When a pager spawned by us exited, the trace log did not record its exit status correctly, which has been corrected. * ab/pager-exit-log: pager: properly log pager exit code when signalled run-command: add braces for "if" block in wait_or_whine() pager: test for exit code with and without SIGPIPE pager: refactor wait_for_pager() function	2021-02-22 16:12:43 -08:00
Junio C Hamano	dc24948be9	Merge branch 'ta/hash-function-transition-doc' Update formatting and grammar of the hash transition plan documentation, plus some updates. * ta/hash-function-transition-doc: doc: use https links doc hash-function-transition: move rationale upwards doc hash-function-transition: fix incomplete sentence doc hash-function-transition: use upper case consistently doc hash-function-transition: use SHA-1 and SHA-256 consistently doc hash-function-transition: fix asciidoc output	2021-02-22 16:12:43 -08:00
Junio C Hamano	15af6e6fee	Merge branch 'bc/signed-objects-with-both-hashes' Signed commits and tags now allow verification of objects, whose two object names (one in SHA-1, the other in SHA-256) are both signed. * bc/signed-objects-with-both-hashes: gpg-interface: remove other signature headers before verifying ref-filter: hoist signature parsing commit: allow parsing arbitrary buffers with headers gpg-interface: improve interface for parsing tags commit: ignore additional signatures when parsing signed commits ref-filter: switch some uses of unsigned long to size_t	2021-02-22 16:12:42 -08:00
Junio C Hamano	b9554c03a0	Merge branch 'dl/stash-cleanup' Documentation, code and test clean-up around "git stash". * dl/stash-cleanup: stash: declare ref_stash as an array t3905: use test_cmp() to check file contents t3905: replace test -s with test_file_not_empty t3905: remove nested git in command substitution t3905: move all commands into test cases t3905: remove spaces after redirect operators git-stash.txt: be explicit about subcommand options	2021-02-22 16:12:42 -08:00
Andrzej Hunt	bf4bb9f9f5	commit-graph: avoid leaking topo_levels slab in write_commit_graph() write_commit_graph initialises topo_levels using init_topo_level_slab(), next it calls compute_topological_levels() which can cause the slab to grow, we therefore need to clear the slab again using clear_topo_level_slab() when we're done. First introduced in `72a2bfca` (commit-graph: add a slab to store topological levels, 2021-01-16). LeakSanitizer output: ==1026==ERROR: LeakSanitizer: detected memory leaks Direct leak of 8 byte(s) in 1 object(s) allocated from: #0 0x498ae9 in realloc /src/llvm-project/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3 #1 0xafbed8 in xrealloc /src/git/wrapper.c:126:8 #2 0x7966d1 in topo_level_slab_at_peek /src/git/commit-graph.c:71:1 #3 0x7965e0 in topo_level_slab_at /src/git/commit-graph.c:71:1 #4 0x78fbf5 in compute_topological_levels /src/git/commit-graph.c:1472:12 #5 0x78c5c3 in write_commit_graph /src/git/commit-graph.c:2456:2 #6 0x535c5f in graph_write /src/git/builtin/commit-graph.c:299:6 #7 0x5350ca in cmd_commit_graph /src/git/builtin/commit-graph.c:337:11 #8 0x4cddb1 in run_builtin /src/git/git.c:453:11 #9 0x4cabe2 in handle_builtin /src/git/git.c:704:3 #10 0x4cd084 in run_argv /src/git/git.c:771:4 #11 0x4ca424 in cmd_main /src/git/git.c:902:19 #12 0x707fb6 in main /src/git/common-main.c:52:11 #13 0x7fee4249383f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2083f) Indirect leak of 524256 byte(s) in 1 object(s) allocated from: #0 0x498942 in calloc /src/llvm-project/compiler-rt/lib/asan/asan_malloc_linux.cpp:154:3 #1 0xafc088 in xcalloc /src/git/wrapper.c:140:8 #2 0x796870 in topo_level_slab_at_peek /src/git/commit-graph.c:71:1 #3 0x7965e0 in topo_level_slab_at /src/git/commit-graph.c:71:1 #4 0x78fbf5 in compute_topological_levels /src/git/commit-graph.c:1472:12 #5 0x78c5c3 in write_commit_graph /src/git/commit-graph.c:2456:2 #6 0x535c5f in graph_write /src/git/builtin/commit-graph.c:299:6 #7 0x5350ca in cmd_commit_graph /src/git/builtin/commit-graph.c:337:11 #8 0x4cddb1 in run_builtin /src/git/git.c:453:11 #9 0x4cabe2 in handle_builtin /src/git/git.c:704:3 #10 0x4cd084 in run_argv /src/git/git.c:771:4 #11 0x4ca424 in cmd_main /src/git/git.c:902:19 #12 0x707fb6 in main /src/git/common-main.c:52:11 #13 0x7fee4249383f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2083f) SUMMARY: AddressSanitizer: 524264 byte(s) leaked in 2 allocation(s). Signed-off-by: Andrzej Hunt <ajrhunt@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-22 13:45:01 -08:00
ZheNing Hu	1c881026a1	difftool.c: learn a new way start at specified file `git difftool` only allow us to select file to view in turn. If there is a commit with many files and we exit in the middle, we will have to traverse list again to get the file diff which we want to see. Therefore,teach the command an option `--skip-to=<path>` to allow the user to say that diffs for earlier paths are not interesting (because they were already seen in an earlier session) and start this session with the named path. Signed-off-by: ZheNing Hu <adlternative@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-22 13:35:49 -08:00
Derrick Stolee	41f3c9949f	commit-reach: stale commits may prune generation further The remove_redundant_with_gen() algorithm performs a depth-first-search to find commits in the 'array' list, starting at the parents of each commit in 'array'. The result is that commits in 'array' are marked STALE when they are reachable from another commit in 'array'. This depth-first-search is fast when commits lie on or near the first-parent history of the higher commits. The search terminates early if all but one commit becomes marked STALE. However, it is possible that there are two independent commits with high generation number. In that case, the depth-first-search might languish by searching in lower generations due to the fixed min_generation used throughout the method. With the expectation that commits with lower generation are expected to become STALE more often, we can optimize further by increasing that min_generation boundary upon discovery of the commit with minimum generation. We must first sort the commits in 'array' by generation. We cannot sort 'array' itself since it must preserve relative order among the returned results (see revision.c:mark_redundant_parents() for an example). This simplifies the initialization of min_generation, but it also allows us to increase the new min_generation when we find the commit with smallest generation remaining. This requires more than two commits in order to test, so I used the Linux kernel repository with a few commits that are slightly off of the first-parent history. I timed the following command: git merge-base --independent 2ecedd756908 d2360a398f0b \ 1253935ad801 160bab43419e 0e2209629fec 1d0e16ac1a9e The first two commits have similar generation and are near the v5.10 tag. Commit 160bab43419e is off of the first-parent history behind v5.5, while the others are scattered somewhere reachable from v5.9. This is designed to demonstrate the optimization, as that commit within v5.5 would normally cause a lot of extra commit walking. Since remove_redundant_with_alg() is called only when at least one of the input commits has a finite generation number, this algorithm is tested with a commit-graph generated starting at a number of different tags, the earliest being v5.5. commit-graph at v5.5: \| Method \| Time \| \|-----------------------+-------\| \| _no_gen() \| 864ms \| \| _with_gen() (before) \| 858ms \| \| _with_gen() (after) \| 810ms \| commit-graph at v5.7: \| Method \| Time \| \|-----------------------+-------\| \| _no_gen() \| 625ms \| \| _with_gen() (before) \| 572ms \| \| _with_gen() (after) \| 517ms \| commit-graph at v5.9: \| Method \| Time \| \|-----------------------+-------\| \| _no_gen() \| 268ms \| \| _with_gen() (before) \| 224ms \| \| _with_gen() (after) \| 202ms \| commit-graph at v5.10: \| Method \| Time \| \|-----------------------+-------\| \| _no_gen() \| 72ms \| \| _with_gen() (before) \| 37ms \| \| _with_gen() (after) \| 9ms \| Note that these are only modest improvements for the case where the two independent commits are not in the commit-graph (not until v5.10). All algorithms get faster as more commits are indexed, which is not a surprise. However, the cost of walking extra commits is more and more prevalent in relative terms as more commits are indexed. Finally, the last case allows us to jump to the minimum generation between the last two commits (that are actually independent) so we greatly reduce the cost in that case. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-22 13:34:34 -08:00
Derrick Stolee	3677773371	commit-reach: use heuristic in remove_redundant() Reachability algorithms in commit-reach.c frequently benefit from using the first-parent history as a heuristic for satisfying reachability queries. The most obvious example was implemented in `4fbcca4e` (commit-reach: make can_all_from_reach... linear, 2018-07-20). Update the walk in remove_redundant() to use this same heuristic. Here, we are walking starting at the parents of the input commits. Sort those parents and walk from the highest generation to lower. Each time, use the heuristic of searching the first parent history before continuing to expand the walk. The order in which we explore the commits matters, so update compare_commits_by_gen to break generation number ties with commit date. This has no effect when the commits are in a commit-graph file with corrected commit dates computed, but it will assist when the commits are in the region "above" the commit-graph with "infinite" generation number. Note that we cannot shift to use compare_commits_by_gen_then_commit_date as the method prototype is different. We use compare_commits_by_gen for QSORT() as opposed to as a priority function. The important piece is to ensure we short-circuit the walk when we find that there is a single non-redundant commit. This happens frequently when looking for merge-bases or comparing several tags with 'git merge-base --independent'. Use a new count 'count_still_independent' and if that hits 1 we can stop walking. To update 'count_still_independent' properly, we add use of the RESULT flag on the input commits. Then we can detect when we reach one of these commits and decrease the count. We need to remove the RESULT flag at that moment because we might re-visit that commit when popping the stack. We use the STALE flag to mark parents that have been added to the new walk_start list, but we need to clear that flag before we start walking so those flags don't halt our depth-first-search walk. On my copy of the Linux kernel repository, the performance of 'git merge-base --independent <all-tags>' goes from 1.1 seconds to 0.11 seconds. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-22 13:34:34 -08:00
Derrick Stolee	c8d693e1e6	commit-reach: move compare_commits_by_gen Move this earlier in the file so it can be used by more methods. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-22 13:34:34 -08:00
Derrick Stolee	fbc21e3fbb	commit-reach: use one walk in remove_redundant() The current implementation of remove_redundant() uses several calls to paint_down_to_common() to determine that commits are independent of each other. This leads to quadratic behavior when many inputs are passed to commands such as 'git merge-base'. For example, in the Linux kernel repository, I tested the performance by passing all tags: git merge-base --independent $(git for-each-ref refs/tags --format="$(refname)") (Note: I had to delete the tags v2.6.11-tree and v2.6.11 as they do not point to commits.) Here is the performance improvement introduced by this change: Before: 16.4s After: 1.1s This performance improvement requires the commit-graph file to be present. We keep the old algorithm around as remove_redundant_no_gen() and use it when generation_numbers_enabled() is false. This is similar to other algorithms within commit-reach.c. The new algorithm is implemented in remove_redundant_with_gen(). The basic approach is to do one commit walk instead of many. First, scan all commits in the list and mark their _parents_ with the STALE flag. This flag will indicate commits that are reachable from one of the inputs, except not including themselves. Then, walk commits until covering all commits up to the minimum generation number pushing the STALE flag throughout. At the end, we need to clear the STALE bit from all of the commits we walked. We move the non-stale commits in 'array' to the beginning of the list, and this might overwrite stale commits. However, we store an array of commits that started the walk, and use clear_commit_marks() on each of those starting commits. That method will walk the reachable commits with the STALE bit and clear them all. This makes the algorithm safe for re-entry or for other uses of those commits after this walk. This logic is covered by tests in t6600-test-reach.sh, so the behavior does not change. This is tested both in the case with a commit-graph and without. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-22 13:34:34 -08:00
Christian Walther	3a837b58e3	doc: mention bigFileThreshold for packing Knowing about the core.bigFileThreshold configuration variable is helpful when examining pack file size differences between repositories. Add a reference to it to the manpages a user is likely to read in this situation. Capitalize CONFIGURATION for consistency with other pages having such a section. Signed-off-by: Christian Walther <cwalther@gmx.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-22 13:18:30 -08:00
Jonathan Tan	5476e1efde	fetch-pack: print and use dangling .gitmodules Teach index-pack to print dangling .gitmodules links after its "keep" or "pack" line instead of declaring an error, and teach fetch-pack to check such lines printed. This allows the tree side of the .gitmodules link to be in one packfile and the blob side to be in another without failing the fsck check, because it is now fetch-pack which checks such objects after all packfiles have been downloaded and indexed (and not index-pack on an individual packfile, as it is before this commit). Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-22 12:07:40 -08:00
Jonathan Tan	b664e9ffa1	fetch-pack: with packfile URIs, use index-pack arg Unify the index-pack arguments used when processing the inline pack and when downloading packfiles referenced by URIs. This is done by teaching get_pack() to also store the index-pack arguments whenever at least one packfile URI is given, and then when processing the packfile URI(s), using the stored arguments. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-22 12:07:40 -08:00
Jonathan Tan	27e35ba6c6	http-fetch: allow custom index-pack args This is the next step in teaching fetch-pack to pass its index-pack arguments when processing packfiles referenced by URIs. The "--keep" in fetch-pack.c will be replaced with a full message in a subsequent commit. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-22 12:07:40 -08:00
Jonathan Tan	726b25a91b	http: allow custom index-pack args Currently, when fetching, packfiles referenced by URIs are run through index-pack without any arguments other than --stdin and --keep, no matter what arguments are used for the packfile that is inline in the fetch response. As a preparation for ensuring that all packs (whether inline or not) use the same index-pack arguments, teach the http subsystem to allow custom index-pack arguments. http-fetch has been updated to use the new API. For now, it passes --keep alone instead of --keep with a process ID, but this is only temporary because http-fetch itself will be taught to accept index-pack parameters (instead of using a hardcoded constant) in a subsequent commit. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-22 12:07:40 -08:00
Pratyush Yadav	b1056f60b6	Merge branch 'py/commit-comments' Use git-stripspace to remove comment lines from the commit message. Also use it to clean up whitespace instead of rolling our own logic. * py/commit-comments: git-gui: remove lines starting with the comment character	2021-02-22 20:19:53 +05:30
Junio C Hamano	1b5b8cf072	Documentation: typofix --column description `f4ed0af6` (Merge branch 'nd/columns', 2012-05-03) brought in three cut-and-pasted copies of malformatted descriptions. Let's fix them all the same way by marking the configuration variable names up as monospace just like the command line option `--column` is typeset. While we are at it, correct a missing space after the full stop that ends the sentence. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-19 19:36:47 -08:00
Derrick Stolee	a43a2e6c2a	chunk-format: add technical docs The chunk-based file format is now an API in the code, but we should also take time to document it as a file format. Specifically, it matches the CHUNK LOOKUP sections of the commit-graph and multi-pack-index files, but there are some commonalities that should be grouped in this document. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-18 13:38:16 -08:00
Derrick Stolee	5387fefadc	chunk-format: restore duplicate chunk checks Before refactoring into the chunk-format API, the commit-graph parsing logic included checks for duplicate chunks. It is unlikely that we would desire a chunk-based file format that allows duplicate chunk IDs in the table of contents, so add duplicate checks into read_table_of_contents(). Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-18 13:38:16 -08:00
Derrick Stolee	329fac3a36	midx: use 64-bit multiplication for chunk sizes When calculating the sizes of certain chunks, we should use 64-bit multiplication always. This allows us to properly predict the chunk sizes without risk of overflow. Other possible overflows were discovered by evaluating each multiplication in midx.c and ensuring that at least one side of the operator was of type size_t or off_t. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-18 13:38:16 -08:00
Derrick Stolee	6ab3b8b8b8	midx: use chunk-format read API Instead of parsing the table of contents directly, use the chunk-format API methods read_table_of_contents() and pair_chunk(). In particular, we can use the return value of pair_chunk() to generate an error when a required chunk is missing. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-18 13:38:16 -08:00
Derrick Stolee	2692c2f6fd	commit-graph: use chunk-format read API Instead of parsing the table of contents directly, use the chunk-format API methods read_table_of_contents() and pair_chunk(). While the current implementation loses the duplicate-chunk detection, that will be added in a future change. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-18 13:38:16 -08:00
Derrick Stolee	5f0879f54b	chunk-format: create read chunk API Add the capability to read the table of contents, then pair the chunks with necessary logic using read_chunk_fn pointers. Callers will be added in future changes, but the typical outline will be: 1. initialize a 'struct chunkfile' with init_chunkfile(NULL). 2. call read_table_of_contents(). 3. for each chunk to parse, a. call pair_chunk() to assign a pointer with the chunk position, or b. call read_chunk() to run a callback on the chunk start and size. 4. call free_chunkfile() to clear the 'struct chunkfile' data. We are re-using the anonymous 'struct chunkfile' data, as it is internal to the chunk-format API. This gives it essentially two modes: write and read. If the same struct instance was used for both reads and writes, then there would be failures. Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-18 13:38:16 -08:00
Derrick Stolee	63a8f0e9b9	midx: use chunk-format API in write_midx_internal() The chunk-format API allows writing the table of contents and all chunks using the anonymous 'struct chunkfile' type. We only need to convert our local chunk logic to this API for the multi-pack-index writes to share that logic with the commit-graph file writes. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-18 13:38:16 -08:00
Derrick Stolee	c1442410d8	midx: drop chunk progress during write Most expensive operations in write_midx_internal() use the context struct's progress member, and these indicate the process of the expensive operations within the chunk writing methods. However, there is a competing progress struct that counts the progress over all chunks. This is not very helpful compared to the others, so drop it. This also reduces our barriers to combining the chunk writing code with chunk-format.c. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-18 13:38:16 -08:00
Derrick Stolee	0ccd713cb6	midx: return success/failure in chunk write methods Historically, the chunk-writing methods in midx.c have returned the amount of data written so the writer method could compare this with the table of contents. This presents with some interesting issues: 1. If a chunk writing method has a bug that miscalculates the written bytes, then we can satisfy the table of contents without actually writing the right amount of data to the hashfile. The commit-graph writing code checks the hashfile struct directly for a more robust verification. 2. There is no way for a chunk writing method to gracefully fail. Returning an int presents an opportunity to fail without a die(). 3. The current pattern doesn't match chunk_write_fn type exactly, so we cannot share code with commit-graph.c For these reasons, convert the midx chunk writer methods to return an 'int'. Since none of them fail at the moment, they all return 0. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-18 13:38:16 -08:00
Derrick Stolee	980f525c3c	midx: add num_large_offsets to write_midx_context In an effort to align write_midx_internal() with the chunk-format API, continue to group necessary data into "struct write_midx_context". This change collects the "uint32_t num_large_offsets" into the context. With this new data, write_midx_large_offsets() now matches the chunk_write_fn type. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-18 13:38:16 -08:00
Derrick Stolee	7a3ada1192	midx: add pack_perm to write_midx_context In an effort to align write_midx_internal() with the chunk-format API, continue to group necessary data into "struct write_midx_context". This change collects the "uint32_t *pack_perm" and large_offsets_needed bit into the context. Update write_midx_object_offsets() to match chunk_write_fn. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-18 13:38:16 -08:00
Derrick Stolee	31bda9a237	midx: add entries to write_midx_context In an effort to align write_midx_internal() with the chunk-format API, continue to group necessary data into "struct write_midx_context". This change collects the "struct pack_midx_entry *entries" list and its count into the context. Update write_midx_oid_fanout() and write_midx_oid_lookup() to take the context directly, as these are easy conversions with this new data. Only the callers of write_midx_object_offsets() and write_midx_large_offsets() are updated here, since additional data in the context before those methods can match chunk_write_fn. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-18 13:38:16 -08:00
Derrick Stolee	b4d941420b	midx: use context in write_midx_pack_names() In an effort to align the write_midx_internal() to use the chunk-format API, start converting chunk writing methods to match chunk_write_fn. The first case is to convert write_midx_pack_names() to take "void *data". We already have the necessary data in "struct write_midx_context", so this conversion is rather mechanical. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-18 13:38:16 -08:00
Derrick Stolee	577dc49696	midx: rename pack_info to write_midx_context In an effort to streamline our chunk-based file formats, align some of the code structure in write_midx_internal() to be similar to the patterns in write_commit_graph_file(). Specifically, let's create a "struct write_midx_context" that can be used as a data parameter to abstract function types. This change only renames "struct pack_info" to "struct write_midx_context" and the names of instances from "packs" to "ctx". In future changes, we will expand the data inside "struct write_midx_context" and align our chunk-writing method with the chunk-format API. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-18 13:38:16 -08:00
Derrick Stolee	47410aa837	commit-graph: use chunk-format write API The commit-graph write logic is ready to make use of the chunk-format write API. Each chunk write method is already in the correct prototype. We only need to use the 'struct chunkfile' pointer and the correct API calls. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-18 13:38:16 -08:00
Derrick Stolee	570df42610	chunk-format: create chunk format write API In anticipation of combining the logic from the commit-graph and multi-pack-index file formats, create a new chunk-format API. Use a 'struct chunkfile' pointer to keep track of data that has been registered for writes. This struct is anonymous outside of chunk-format.c to ensure no user attempts to interfere with the data. The next change will use this API in commit-graph.c, but the general approach is: 1. initialize the chunkfile with init_chunkfile(f). 2. add chunks in the intended writing order with add_chunk(). 3. write any header information to the hashfile f. 4. write the chunkfile data using write_chunkfile(). 5. free the chunkfile struct using free_chunkfile(). Helped-by: Taylor Blau <me@ttaylorr.com> Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-18 13:38:16 -08:00
Martin Ågren	f89f46b704	gitmailmap.txt: fix rendering of e-mail addresses Both AsciiDoc and Asciidoctor are eager to pick up the e-mail addresses in this document and turn them into references at the bottom of the manpage / clickable links. We don't really need that for these dummy addresses. Spell "@" as "@" to make them not do this. In the open block, we can instead avoid this by indenting the contents, similar to the earlier blocks. Fix a backtick which should have been a single quote mark. With all the quoting that is going on around here, this mistake trips up the parsing and rendering quite a bit. Before this commit, we have the same failure mode with AsciiDoc 8.6.10 and Asciidoctor 1.5.5, and this change makes both of them happy. Signed-off-by: Martin Ågren <martin.agren@gmail.com> Acked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-18 10:53:33 -08:00
Martin Ågren	83171ede22	git.txt: fix monospace rendering When we write `<name>`s with the "s" tucked on to the closing backtick, we end up rendering the backticks literally. Rephrase this sentence slightly to render this as monospace. Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-18 10:53:33 -08:00
Pratyush Yadav	b9a43869c9	git-gui: remove lines starting with the comment character The comment character is specified by the config variable 'core.commentchar'. Any lines starting with this character is considered a comment and should not be included in the final commit message. Teach git-gui to filter out lines in the commit message that start with the comment character using git-stripspace. If the config is not set, '#' is taken as the default. Also add a message educating users about the comment character. Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>	2021-02-18 23:35:57 +05:30
Junio C Hamano	2283e0e9af	The ninth batch Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-17 17:21:43 -08:00
Junio C Hamano	483e09e810	Merge branch 'ak/config-bad-bool-error' The error message given when a configuration variable that is expected to have a boolean value has been improved. * ak/config-bad-bool-error: config: improve error message for boolean config	2021-02-17 17:21:43 -08:00
Junio C Hamano	e68f62be8d	Merge branch 'js/reflog-expire-stale-fix' "git reflog expire --stale-fix" can be used to repair the reflog by removing entries that refer to objects that have been pruned away, but was not careful to tolerate missing objects. * js/reflog-expire-stale-fix: reflog expire --stale-fix: be generous about missing objects	2021-02-17 17:21:43 -08:00
Junio C Hamano	726b11d68a	Merge branch 'js/commit-graph-warning' When certain features (e.g. grafts) used in the repository are incompatible with the use of the commit-graph, we used to silently turned commit-graph off; we now tell the user what we are doing. * js/commit-graph-warning: commit-graph: when incompatible with graphs, indicate why	2021-02-17 17:21:42 -08:00
Junio C Hamano	e9b4c483c7	Merge branch 'ew/rev-parse-since-test' Test to make sure "git rev-parse one-thing one-thing" gives the same thing twice (when one-thing is --since=X). * ew/rev-parse-since-test: t1500: ensure current --since= behavior remains	2021-02-17 17:21:42 -08:00
Junio C Hamano	d494433d26	Merge branch 'ds/maintenance-pack-refs' "git maintenance" tool learned a new "pack-refs" maintenance task. * ds/maintenance-pack-refs: maintenance: incremental strategy runs pack-refs weekly maintenance: add pack-refs task	2021-02-17 17:21:42 -08:00
Junio C Hamano	fdf3a27ca9	Merge branch 'jx/t5411-unique-filenames' Avoid individual tests in t5411 from getting affected by each other by forcing them to use separate output files during the test. * jx/t5411-unique-filenames: t5411: refactor check of refs using test_cmp_refs t5411: use different out file to prevent overwriting	2021-02-17 17:21:42 -08:00
Junio C Hamano	9e634a91c8	Merge branch 'js/fsck-name-objects-fix' Fix "git fsck --name-objects" which apparently has not been used by anybody who is motivated enough to report breakage. * js/fsck-name-objects-fix: fsck --name-objects: be more careful parsing generation numbers t1450: robustify `remove_object()`	2021-02-17 17:21:42 -08:00
Junio C Hamano	9bdccbcda7	Merge branch 'jk/mailmap-only-at-root' The .mailmap is documented to be read only from the root level of a working tree, but a stray file in a bare repository also was read by accident, which has been corrected. * jk/mailmap-only-at-root: mailmap: only look for .mailmap in work tree	2021-02-17 17:21:42 -08:00
Junio C Hamano	f712632a51	Merge branch 'mt/grep-cached-untracked' "git grep --untracked" is meant to be "let's ALSO find in these files on the filesystem" when looking for matches in the working tree files, and does not make any sense if the primary search is done against the index, or the tree objects. The "--cached" and "--untracked" options have been marked as mutually incompatible. * mt/grep-cached-untracked: grep: error out if --untracked is used with --cached	2021-02-17 17:21:41 -08:00
Junio C Hamano	78a26cb720	Merge branch 'sh/mergetool-hideresolved' "git mergetool" feeds three versions (base, local and remote) of a conflicted path unmodified. The command learned to optionally prepare these files with unconflicted parts already resolved. * sh/mergetool-hideresolved: mergetool: add per-tool support and overrides for the hideResolved flag mergetool: break setup_tool out into separate initialization function mergetool: add hideResolved configuration	2021-02-17 17:21:41 -08:00
Junio C Hamano	aa2d3dbdf5	Merge branch 'jt/trace2-BUG' Even though invocations of "die()" were logged to the trace2 system, "BUG()"s were not, which has been corrected. * jt/trace2-BUG: usage: trace2 BUG() invocations	2021-02-17 17:21:41 -08:00
Junio C Hamano	dadc91ff0c	Merge branch 'js/range-diff-one-side-only' The "git range-diff" command learned "--(left\|right)-only" option to show only one side of the compared range. * js/range-diff-one-side-only: range-diff: offer --left-only/--right-only options range-diff: move the diffopt initialization down one layer range-diff: combine all options in a single data structure range-diff: simplify code spawning `git log` range-diff: libify the read_patches() function again range-diff: avoid leaking memory in two error code paths	2021-02-17 17:21:41 -08:00
Junio C Hamano	77348b0e6e	Merge branch 'js/range-diff-wo-dotdot' There are other ways than ".." for a single token to denote a "commit range", namely "<rev>^!" and "<rev>^-<n>", but "git range-diff" did not understand them. * js/range-diff-wo-dotdot: range-diff(docs): explain how to specify commit ranges range-diff/format-patch: handle commit ranges other than A..B range-diff/format-patch: refactor check for commit range	2021-02-17 17:21:41 -08:00
Junio C Hamano	69571dfe21	Merge branch 'jt/clone-unborn-head' "git clone" tries to locally check out the branch pointed at by HEAD of the remote repository after it is done, but the protocol did not convey the information necessary to do so when copying an empty repository. The protocol v2 learned how to do so. * jt/clone-unborn-head: clone: respect remote unborn HEAD connect, transport: encapsulate arg in struct ls-refs: report unborn targets of symrefs	2021-02-17 17:21:40 -08:00
Junio C Hamano	0871fb9af5	Merge branch 'mr/bisect-in-c-4' Piecemeal of rewrite of "git bisect" in C continues. * mr/bisect-in-c-4: bisect--helper: retire `--check-and-set-terms` subcommand bisect--helper: reimplement `bisect_skip` shell function in C bisect--helper: retire `--bisect-auto-next` subcommand bisect--helper: use `res` instead of return in BISECT_RESET case option bisect--helper: retire `--bisect-write` subcommand bisect--helper: reimplement `bisect_replay` shell function in C bisect--helper: reimplement `bisect_log` shell function in C	2021-02-17 17:21:40 -08:00
Junio C Hamano	5bd0b21bf7	Merge branch 'ds/commit-graph-genno-fix' Fix incremental update of commit-graph file around corrected commit date data. * ds/commit-graph-genno-fix: commit-graph: prepare commit graph commit-graph: be extra careful about mixed generations commit-graph: compute generations separately commit-graph: validate layers for generation data commit-graph: always parse before commit_graph_data_at() commit-graph: use repo_parse_commit	2021-02-17 17:21:40 -08:00
Junio C Hamano	8b4701ae4f	Merge branch 'ak/corrected-commit-date' The commit-graph learned to use corrected commit dates instead of the generation number to help topological revision traversal. * ak/corrected-commit-date: doc: add corrected commit date info commit-reach: use corrected commit dates in paint_down_to_common() commit-graph: use generation v2 only if entire chain does commit-graph: implement generation data chunk commit-graph: implement corrected commit date commit-graph: return 64-bit generation number commit-graph: add a slab to store topological levels t6600-test-reach: generalize *_three_modes commit-graph: consolidate fill_commit_graph_info revision: parse parent in indegree_walk_step() commit-graph: fix regression when computing Bloom filters	2021-02-17 17:21:40 -08:00
Ævar Arnfjörð Bjarmason	c1760352e0	grep/pcre2: move definitions of pcre2_{malloc,free} Move the definitions of the pcre2_{malloc,free} functions above the compile_pcre2_pattern() function they're used in. Before the preceding commit they used to be needed earlier, but now we can move them to be adjacent to the other PCREv2 functions. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-17 16:32:19 -08:00
Ævar Arnfjörð Bjarmason	cbe81e653f	grep/pcre2: move back to thread-only PCREv2 structures Change the setup of the "pcre2_general_context" to happen per-thread in compile_pcre2_pattern() instead of in grep_init(). This change brings it in line with how the rest of the pcre2_* members in the grep_pat structure are set up. As noted in the preceding commit the approach `513f2b0bbd` (grep: make PCRE2 aware of custom allocator, 2019-10-16) took to allocate the pcre2_general_context seems to have been initially based on a misunderstanding of how PCREv2 memory allocation works. The approach of creating a global context in grep_init() is just added complexity for almost zero gain. On my system it's 24 bytes saved per-thread. For comparison PCREv2 will then go on to allocate at least a kilobyte for its own thread-local state. As noted in `6d423dd542` (grep: don't redundantly compile throwaway patterns under threading, 2017-05-25) the grep code is intentionally not trying to micro-optimize allocations by e.g. sharing some PCREv2 structures globally, while making others thread-local. So let's remove this special case and make all of them thread-local again for simplicity. With this change we could move the pcre2_{malloc,free} functions around to live closer to their current use. I'm not doing that here to keep this change small, that cleanup will be done in a follow-up commit. See also the discussion in `94da9193a6` (grep: add support for PCRE v2, 2017-06-01) about thread safety, and Johannes's comments[1] to the effect that we should be doing what this patch is doing. 1. https://lore.kernel.org/git/nycvar.QRO.7.76.6.1908052120302.46@tvgsbejvaqbjf.bet/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-17 16:32:19 -08:00
Ævar Arnfjörð Bjarmason	8d12851342	grep/pcre2: actually make pcre2 use custom allocator Continue work started in `513f2b0bbd` (grep: make PCRE2 aware of custom allocator, 2019-10-16) and make PCREv2 use our pcre2_{malloc,free}(). functions for allocation. We'll now use it for all PCREv2 allocations. The reason `513f2b0bbd` worked as a bugfix for the USE_NED_ALLOCATOR issue is because it targeted the allocation freed via free(), as opposed to by a pcre2_free() function. I.e. the pcre2_maketables() and pcre2_maketables_free() pair. For most of the rest we continued allocating with stock malloc() inside PCREv2 itself, but didn't segfault because we'd use its corresponding free(). In a preceding commit of mine I changed the free() to pcre2_maketables_free() on versions of PCREv2 10.34 and newer. So as far as fixing the segfault goes we could revert `513f2b0bbd`. But then we wouldn't use the desired allocator, let's just use it instead. Before this patch we'd on e.g.: grep --threads=1 -iP æ.var.*xyz Only use pcre2_{malloc,free}() for 2 malloc() calls and 2 corresponding free() calls. Now it's 12 calls to each. This can be observed with the GREP_PCRE2_DEBUG_MALLOC debug mode. Reading the history of how this bug got introduced it wasn't present in Johannes's original patch[1] to fix the issue. My reading of that thread is that the approach the follow-up patches to Johannes's original pursued were based on misunderstanding of how the PCREv2 API works. In particular this part of [2]: "most of the time (like when using UTF-8) the chartable (and therefore the global context) is not needed (even when using alternate allocators)" That's simply not how PCREv2 memory allocation works. It's easy to see how the misunderstanding came about. It's because (as noted above) the issue was noticed because of our use of free() in our own grep.c for freeing the memory allocated by pcre2_maketables(). Thus the misunderstanding that PCREv2's compile context is something only needed for pcre2_maketables(), and e.g. an aborted earlier attempt[3] to only set it up when we ourselves called pcre2_maketables(). That's not what PCREv2's compile context is. To quote PCREv2's documentation: "This context just contains pointers to (and data for) external memory management functions that are called from several places in the PCRE2 library." Thus the failed attempts to go down the route of only creating the general context in cases where we ourselves call pcre2_maketables(), before finally settling on the approach `513f2b0bbd` took of always creating it, but then mostly not using it. Instead we should always create it, and then pass the general context to those functions that accept it, so that they'll consistently use our preferred memory allocation functions. 1. https://lore.kernel.org/git/3397e6797f872aedd18c6d795f4976e1c579514b.1565005867.git.gitgitgadget@gmail.com/ 2. https://lore.kernel.org/git/CAPUEsphMh_ZqcH3M7PXC9jHTfEdQN3mhTAK2JDkdvKBp53YBoA@mail.gmail.com/ 3. https://lore.kernel.org/git/20190806085014.47776-3-carenas@gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-17 16:32:19 -08:00
Ævar Arnfjörð Bjarmason	b76bf27f6a	grep/pcre2: use pcre2_maketables_free() function Make use of the pcre2_maketables_free() function to free the memory allocated by pcre2_maketables(). At first sight it's strange that `10da030ab7` (grep: avoid leak of chartables in PCRE2, 2019-10-16) which added the free() call here doesn't make use of the pcre2_free() the author introduced in the preceding commit in `513f2b0bbd` (grep: make PCRE2 aware of custom allocator, 2019-10-16). The reason is that at the time the function didn't exist. It was first introduced in PCREv2 version 10.34, released on 2019-11-21. Let's make use of it behind a macro. I don't think this matters for anything to do with custom allocators, but it makes our use of PCREv2 more discoverable. At some distant point in the future we'll be able to drop the version guard, as nobody will be running a version older than 10.34. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-17 16:32:19 -08:00
Ævar Arnfjörð Bjarmason	797c359978	grep/pcre2: use compile-time PCREv2 version test Replace a use of pcre2_config(PCRE2_CONFIG_VERSION, ...) which I added in `95ca1f987e` (grep/pcre2: better support invalid UTF-8 haystacks, 2021-01-24) with the same test done at compile-time. It might be cuter to do this at runtime since we don't have to do the "major >= 11 \|\| (major >= 10 && ...)" test. But in the next commit we'll add another version comparison that absolutely needs to be done at compile-time, so we're better of being consistent across the board. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-17 16:32:19 -08:00
Ævar Arnfjörð Bjarmason	a39b4003f0	grep/pcre2: add GREP_PCRE2_DEBUG_MALLOC debug mode Add optional printing of PCREv2 allocations to stderr for a developer who manually changes the GREP_PCRE2_DEBUG_MALLOC definition to "1". You need to manually change the definition in the source file similar to the DEBUG_MAILMAP, there's no Makefile knob for this. This will be referenced a subsequent commit, and is generally useful to manually see what's going on with PCREv2 allocations while working on that code. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-17 16:32:19 -08:00
Ævar Arnfjörð Bjarmason	588e4fb191	grep/pcre2: prepare to add debugging to pcre2_malloc() Change pcre2_malloc() in a way that'll make it easier for a debugging fprintf() to spew out the allocated pointer. This doesn't introduce any functional change, it just makes a subsequent commit's diff easier to read. Changes code added in `513f2b0bbd` (grep: make PCRE2 aware of custom allocator, 2019-10-16). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-17 16:32:19 -08:00
Ævar Arnfjörð Bjarmason	47eebd2fd2	grep/pcre2: correct reference to grep_init() in comment Correct a comment added in `513f2b0bbd` (grep: make PCRE2 aware of custom allocator, 2019-10-16). This comment was never correct in git.git, but was consistent with an older version of the patch[1]. 1. https://lore.kernel.org/git/20190806163658.66932-3-carenas@gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-17 16:32:18 -08:00
Ævar Arnfjörð Bjarmason	1cfc5a850c	grep/pcre2: drop needless assignment to NULL Remove a redundant assignment of pcre2_compile_context dating back to my `94da9193a6` (grep: add support for PCRE v2, 2017-06-01). In create_grep_pat() we xcalloc() the "grep_pat" struct, so there's no need to NULL out individual members here. I think this was probably something left over from an earlier development version of mine. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-17 16:32:18 -08:00
Ævar Arnfjörð Bjarmason	0ddf8ceac0	grep/pcre2: drop needless assignment + assert() on opt->pcre2 Drop an assignment added in `b65abcafc7` (grep: use PCRE v2 for optimized fixed-string search, 2019-07-01) and the overly cautious assert() I added in `94da9193a6` (grep: add support for PCRE v2, 2017-06-01). There was never a good reason for this, it's just a relic from when I initially wrote the PCREv2 support. We're not going to have confusion about compile_pcre2_pattern() being called when it shouldn't just because we forgot to cargo-cult this opt->pcre2 option. Furthermore the "struct grep_opt" is (mostly) used for the options the user supplied, let's avoid the pattern of needlessly assigning to it. With my recent removal of the PCREv1 backend in `7599730b7e` (Remove support for v1 of the PCRE library, 2021-01-24) there's even less confusion around what we call where in these codepaths, which is one more reason to remove this. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-17 16:32:18 -08:00
Joey Salazar	9d336655ba	doc: fix naming of response-end-pkt Git Protocol version 2[1] defines 0002 as a Message Packet that indicates the end of a response for stateless connections. Change the naming of the 0002 Packet to 'Response End' to match the parsing introduced in Wireshark's MR !1922 for consistency. A subsequent MR in Wireshark will address additional mismatches. [1] kernel.org/pub/software/scm/git/docs/technical/protocol-v2.html [2] gitlab.com/wireshark/wireshark/-/merge_requests/1922 Signed-off-by: Joey Salazar <jgsal@protonmail.com> Reviewed-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-17 16:30:43 -08:00
Jeff King	a1db097e10	docs/rev-list: add some examples of --disk-usage It's not immediately obvious why --disk-usage might be a useful thing. These examples show off a few of the real-world cases I've used it for. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-17 16:25:29 -08:00
Jeff King	669b458755	docs/rev-list: add an examples section We currently don't show any examples of using git-rev-list at all. Let's add some pretty elementary examples. They likely seem obvious to anybody who has worked with the tool for a while, but my purpose here is two-fold: - they may be enlightening to people who haven't used the tool a lot to give a general flavor of how it is meant to be used - they can serve as a starting point for adding more interesting examples (we can do that without the basic ones, of course, but I think it makes sense to show off the building blocks) This set is far from exhaustive, but again, the purpose is to be a starting point for further additions. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-17 16:22:13 -08:00
Martin Ågren	452d26448d	rev-list-options.txt: fix rendering of bonus paragraph In git-log(1) -- but not in git-shortlog(1) or git-rev-list(1) -- we include a bonus paragraph in the description of `--first-parent`. But we forgot to add a lone "+" for a list continuation, and we shouldn't be indenting this second paragraph. As a result, we get a different indentation and the `backticks` render literally. Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-17 13:16:11 -08:00
Rafael Silva	8e16effe97	blame: remove unnecessary use of get_commit_info() When `git blame --color-by-age`, the determine_line_heat() is called to select how to color the output based on the commit's author date. It uses the get_commit_info() to parse the information into a `commit_info` structure, however, this is actually unnecessary because the determine_line_heat() caller also does the same. Instead, let's change the determine_line_heat() to take a `commit_info` structure and remove the internal call to get_commit_info() thus cleaning up and optimizing the code path. Enabling Git's trace2 API in order to record the execution time for every call to determine_line_heat() function: + trace2_region_enter("blame", "determine_line_heat", the_repository); determine_line_heat(ent, &default_color); + trace2_region_enter("blame", "determine_line_heat", the_repository); Then, running `git blame` for "kernel/fork.c" in linux.git and summing all the execution time for every call (around 1.3k calls) resulted in 2.6x faster execution (best out 3): git built from `328c109303` (The eighth batch, 2021-02-12) = 42ms git built from `328c109303` + this change = 16ms Signed-off-by: Rafael Silva <rafaeloliveira.cs@gmail.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-17 11:04:17 -08:00
René Scharfe	b081547ec1	pretty: add merge and exclude options to %(describe) Allow restricting the tags used by the placeholder %(describe) with the options match and exclude. E.g. the following command describes the current commit using official version tags, without those for release candidates: $ git log -1 --format='%(describe:match=v[0-9],exclude=rc*)' Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-17 09:54:33 -08:00
René Scharfe	15ae82d5d6	pretty: add %(describe) Add a format placeholder for describe output. Implement it by actually calling git describe, which is simple and guarantees correctness. It's intended to be used with $Format:...$ in files with the attribute export-subst and git archive. It can also be used with git log etc., even though that's going to be slow due to the fork for each commit. Suggested-by: Eli Schwartz <eschwartz@archlinux.org> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-17 09:54:31 -08:00
Jeff Hostetler	fcd19b09f8	fsmonitor: refactor initialization of fsmonitor_last_update token Isolate and document initialization of `istate->fsmonitor_last_update`. This field should contain a fsmonitor-specific opaque token, but we need to initialize it before we can actually talk to a fsmonitor process, so we create a generic default value. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-16 17:14:35 -08:00
Kevin Willford	ff03836b9d	fsmonitor: allow all entries for a folder to be invalidated Allow fsmonitor to report directory changes by reporting paths with a trailing slash. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Kevin Willford <Kevin.Willford@microsoft.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-16 17:14:35 -08:00
Jeff Hostetler	29fbbf43a0	fsmonitor: log FSMN token when reading and writing the index Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-16 17:14:35 -08:00
Jeff Hostetler	940b94f35c	fsmonitor: log invocation of FSMonitor hook to trace2 Let's measure the time taken to request and receive FSMonitor data via the hook API and the size of the response. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-16 17:14:34 -08:00
Jeff Hostetler	15268d12be	read-cache: log the number of scanned files to trace2 Report the number of files in the working directory that were read and their hashes verified in `refresh_index()`. FSMonitor improves the performance of commands like `git status` by avoiding scanning the disk for changed files. Let's measure this. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-16 17:14:34 -08:00
Jeff Hostetler	a98e0f2d31	read-cache: log the number of lstat calls to trace2 Report the total number of calls made to lstat() inside of refresh_index(). FSMonitor improves the performance of commands like `git status` by avoiding scanning the disk for changed files. This can be seen in `refresh_index()`. Let's measure this. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-16 17:14:34 -08:00
Jeff Hostetler	8c4b7503d0	preload-index: log the number of lstat calls to trace2 Report the total number of calls made to lstat() inside preload_index(). FSMonitor improves the performance of commands like `git status` by avoiding scanning the disk for changed files. This can be seen in `preload_index()`. Let's measure this. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-16 17:14:34 -08:00
Jeff Hostetler	4f2009dce2	p7519: add trace logging during perf test Add optional trace logging to allow us to better compare performance of various fsmonitor providers and compare results with non-fsmonitor runs. Currently, this includes Trace2 logging, but may be extended to include other trace targets, such as GIT_TRACE_FSMONITOR if desired. Using this logging helped me explain an odd behavior on MacOS where the kernel was dropping events and causing the hook to Watchman to timeout. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-16 17:14:34 -08:00
Jeff Hostetler	a7556c3bde	p7519: move watchman cleanup earlier in the test Shutdown Watchman after the Watchman-based tests and before the block of "no fsmonitor" tests. This helps ensure that Watchman cannot affect the test results for the other. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-16 17:14:34 -08:00
Jeff Hostetler	0917763d67	p7519: fix watchman watch-list test on Windows Only use the final portion of the test trash directory file name when verifying that Watchman was started. On Windows and under the SDK, $GIT_WORKTREE is a cygwin-style path with forward slashes and a "/c/" drive name. However `watchman watch-list` reports a proper Windows-style pathname with drive letters and backslashes. This causes the grep to fail. Since we don't really care about the full pathname (and we really don't want to bother with normalizaing them), just see if the test-name portion of the path is found. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-16 17:14:34 -08:00
Jeff Hostetler	eb10e637cf	p7519: do not rely on "xargs -d" in test Convert the test to use a more portable method to update the mtime on a large number of files under version control. The Mac version of xargs does not support the "-d" option. Likewise, the "-0" and "--null" options are not portable. Furthermore, use `test-tool chmtime` rather than `touch` to update the mtime to ensure that it is actually updated (especially on file systems with only whole second resolution). Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-16 17:14:34 -08:00
Matheus Tavares	3f7ba60350	checkout-index: omit entries with no tempname from --temp output With --temp (or --stage=all, which implies --temp), checkout-index writes a list to stdout associating temporary file names to the entries' names. But if it fails to write an entry, and the failure happens before even assigning a temporary filename to that entry, we get an odd output line. This can be seen when trying to check out a symlink whose blob is missing: $ missing_blob=$(git hash-object --stdin </dev/null) $ git update-index --add --cacheinfo 120000,$missing_blob,foo $ git checkout-index --temp foo error: unable to read sha1 file of foo (`e69de29bb2`) foo The 'TAB foo' line is not much useful and it might break scripts that expect the 'tempname TAB foo' output. So let's omit such entries from the stdout list (but leaving the error message on stderr). We could also consider omitting _all_ failed entries from the output list, but that's probably not a good idea as the associated tempfiles may have been created even when checkout failed, so scripts may want to use the output list for cleanup. Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-16 11:27:18 -08:00
Matheus Tavares	9334ea8e92	write_entry(): fix misuses of `path` in error messages The variables `path` and `ce->name`, at write_entry(), usually have the same contents, but that's not the case when using a checkout prefix or writing to a tempfile. (In fact, `path` will be either empty or dirty when writing to a tempfile.) Therefore, these variables cannot be used interchangeably. In this sense, fix wrong uses of `path` in error messages where it should really be `ce->name`, and add some regression tests. (Note: there doesn't seem to be any misuse in the other way around.) Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-16 11:27:17 -08:00
Jeff King	adcd9f5472	mailmap: do not respect symlinks for in-tree .mailmap As with .gitattributes and .gitignore, we would like to make sure that .mailmap files are handled consistently whether read from the a blob (as is the default behavior in a bare repo) or from the filesystem. Likewise, we would like to avoid reading out-of-tree files pointed to by a symlink, which could have security implications in certain setups. We can cover both by using open_nofollow() when opening the in-tree files. We'll continue to follow links for mailmap.file, as well as when reading .mailmap from the current directory when outside of a repository entirely. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-16 09:41:33 -08:00
Jeff King	feb9b7792f	exclude: do not respect symlinks for in-tree .gitignore As with .gitattributes, we would like to make sure that .gitignore files are handled consistently whether read from the index or from the filesystem. Likewise, we would like to avoid reading out-of-tree files pointed to by the symlinks, which could have security implications in certain setups. We can cover both by using open_nofollow() when opening the in-tree files. We'll continue to follow links for core.excludesFile, as well as $GIT_DIR/info/exclude. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-16 09:41:33 -08:00
Jeff King	2ef579e261	attr: do not respect symlinks for in-tree .gitattributes The attributes system may sometimes read in-tree files from the filesystem, and sometimes from the index. In the latter case, we do not resolve symbolic links (and are not likely to ever start doing so). Let's open filesystem links with O_NOFOLLOW so that the two cases behave consistently. As a bonus, this means that git will not follow such symlinks to read and parse out-of-tree paths. In some cases this could have security implications, as a malicious repository can cause Git to open and read arbitrary files. It could already feed arbitrary content to the parser, but in certain setups it might be able to exfiltrate data from those paths (e.g., if an automated service operating on the malicious repo reveals its stderr to an attacker). Note that O_NOFOLLOW only prevents following links for the path itself, not intermediate directories in the path. At first glance, it seems like ln -s /some/path in-repo might still look at "in-repo/.gitattributes", following the symlink to "/some/path/.gitattributes". However, if "in-repo" is a symbolic link, then we know that it has no git paths below it, and will never look at its .gitattributes file. We will continue to support out-of-tree symbolic links (e.g., in $GIT_DIR/info/attributes); this just affects in-tree links. When a symbolic link is encountered, the contents are ignored and a warning is printed. POSIX specifies ELOOP in this case, so the user would generally see something like: warning: unable to access '.gitattributes': Too many levels of symbolic links Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-16 09:41:33 -08:00
Jeff King	1679d60bfc	exclude: add flags parameter to add_patterns() There are a number of callers of add_patterns() and its sibling functions. Let's give them a "flags" parameter for adding new options without having to touch each caller. We'll use this in a future patch to add O_NOFOLLOW support. But for now each caller just passes 0. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-16 09:41:33 -08:00
Jeff King	dbf387d550	attr: convert "macro_ok" into a flags field The attribute code can have a rather deep callstack, through which we have to pass the "macro_ok" flag. In anticipation of adding other flags, let's convert this to a generic bit-field. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-16 09:41:32 -08:00
Jeff King	00611d8440	add open_nofollow() helper Some callers of open() would like to use O_NOFOLLOW, but it is not available on all platforms. Let's abstract this into a helper function so we can provide system-specific implementations. Some light web-searching reveals that we might be able to get something similar on Windows using FILE_FLAG_OPEN_REPARSE_POINT. I didn't dig into this further. For other systems without O_NOFOLLOW or any equivalent, we have two options for fallback: - we can just open anyway, following symlinks; this may have security implications (e.g., following untrusted in-tree symlinks) - we can determine whether the path is a symlink with lstat(). This is slower (two syscalls instead of one), but that may be acceptable for infrequent uses like looking up .gitattributes files (especially because we can get away with a single syscall for the common case of ENOENT). It's also racy, but should be sufficient for our needs (we are worried about in-tree symlinks that we ourselves would have previously created). We could make it non-racy at the cost of making it even slower, by doing an fstat() on the opened descriptor and comparing the dev/ino fields to the original lstat(). This patch implements the lstat() option in its slightly-faster racy form. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-16 09:41:32 -08:00
Junio C Hamano	1eb4136ac2	diff: --{rotate,skip}-to=<path> In the implementation of "git difftool", there is a case where the user wants to start viewing the diffs at a specific path and continue on to the rest, optionally wrapping around to the beginning. Since it is somewhat cumbersome to implement such a feature as a post-processing step of "git diff" output, let's support it internally with two new options. - "git diff --rotate-to=C", when the resulting patch would show paths A B C D E without the option, would "rotate" the paths to shows patch to C D E A B instead. It is an error when there is no patch for C is shown. - "git diff --skip-to=C" would instead "skip" the paths before C, and shows patch to C D E. Again, it is an error when there is no patch for C is shown. - "git log [-p]" also accepts these two options, but it is not an error if there is no change to the specified path. Instead, the set of output paths are rotated or skipped to the specified path or the first path that sorts after the specified path. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-16 09:30:42 -08:00
Elijah Newren	f78cf97617	merge-ort: call diffcore_rename() directly We want to pass additional information to diffcore_rename() (or some variant thereof) without plumbing that extra information through diff_tree_oid() and diffcore_std(). Further, since we will need to gather additional special information related to diffs and are walking the trees anyway in collect_merge_info(), it seems odd to have diff_tree_oid()/diffcore_std() repeat those tree walks. And there may be times where we can avoid traversing into a subtree in collect_merge_info() (based on additional information at our disposal), that the basic diff logic would be unable to take advantage of. For all these reasons, just create the add and delete pairs ourself and then call diffcore_rename() directly. This change is primarily about enabling future optimizations; the advantage of avoiding extra tree traversals is small compared to the cost of rename detection, and the advantage of avoiding the extra tree traversals is somewhat offset by the extra time spent in collect_merge_info() collecting the additional data anyway. However... For the testcases mentioned in commit `557ac0350d` ("merge-ort: begin performance work; instrument with trace2_region_* calls", 2020-10-28), this change improves the performance as follows: Before After no-renames: 13.294 s ± 0.103 s 12.775 s ± 0.062 s mega-renames: 187.248 s ± 0.882 s 188.754 s ± 0.284 s just-one-mega: 5.557 s ± 0.017 s 5.599 s ± 0.019 s Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-15 18:02:16 -08:00
Elijah Newren	07c9a7fcb5	gitdiffcore doc: mention new preliminary step for rename detection The last few patches have introduced a new preliminary step when rename detection is on but both break detection and copy detection are off. Document this new step. While we're at it, add a testcase that checks the new behavior as well. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-15 18:02:16 -08:00
Elijah Newren	bd24aa2f97	diffcore-rename: guide inexact rename detection based on basenames Make use of the new find_basename_matches() function added in the last two patches, to find renames more rapidly in cases where we can match up files based on basenames. As a quick reminder (see the last two commit messages for more details), this means for example that docs/extensions.txt and docs/config/extensions.txt are considered likely renames if there are no remaining 'extensions.txt' files elsewhere among the added and deleted files, and if a similarity check confirms they are similar, then they are marked as a rename without looking for a better similarity match among other files. This is a behavioral change, as covered in more detail in the previous commit message. We do not use this heuristic together with either break or copy detection. The point of break detection is to say that filename similarity does not imply file content similarity, and we only want to know about file content similarity. The point of copy detection is to use more resources to check for additional similarities, while this is an optimization that uses far less resources but which might also result in finding slightly fewer similarities. So the idea behind this optimization goes against both of those features, and will be turned off for both. For the testcases mentioned in commit `557ac0350d` ("merge-ort: begin performance work; instrument with trace2_region_* calls", 2020-10-28), this change improves the performance as follows: Before After no-renames: 13.815 s ± 0.062 s 13.294 s ± 0.103 s mega-renames: 1799.937 s ± 0.493 s 187.248 s ± 0.882 s just-one-mega: 51.289 s ± 0.019 s 5.557 s ± 0.017 s Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-15 18:02:16 -08:00
Elijah Newren	da09f65127	diffcore-rename: complete find_basename_matches() It is not uncommon in real world repositories for the majority of file renames to not change the basename of the file; i.e. most "renames" are just a move of files into different directories. We can make use of this to avoid comparing all rename source candidates with all rename destination candidates, by first comparing sources to destinations with the same basenames. If two files with the same basename are sufficiently similar, we record the rename; if not, we include those files in the more exhaustive matrix comparison. This means we are adding a set of preliminary additional comparisons, but for each file we only compare it with at most one other file. For example, if there was a include/media/device.h that was deleted and a src/module/media/device.h that was added, and there are no other device.h files in the remaining sets of added and deleted files after exact rename detection, then these two files would be compared in the preliminary step. This commit does not yet actually employ this new optimization, it merely adds a function which can be used for this purpose. The next commit will do the necessary plumbing to make use of it. Note that this optimization might give us different results than without the optimization, because it's possible that despite files with the same basename being sufficiently similar to be considered a rename, there's an even better match between files without the same basename. I think that is okay for four reasons: (1) it's easy to explain to the users what happened if it does ever occur (or even for them to intuitively figure out), (2) as the next patch will show it provides such a large performance boost that it's worth the tradeoff, and (3) it's somewhat unlikely that despite having unique matching basenames that other files serve as better matches. Reason (4) takes a full paragraph to explain... If the previous three reasons aren't enough, consider what rename detection already does. Break detection is not the default, meaning that if files have the same _fullname_, then they are considered related even if they are 0% similar. In fact, in such a case, we don't even bother comparing the files to see if they are similar let alone comparing them to all other files to see what they are most similar to. Basically, we override content similarity based on sufficient filename similarity. Without the filename similarity (currently implemented as an exact match of filename), we swing the pendulum the opposite direction and say that filename similarity is irrelevant and compare a full N x M matrix of sources and destinations to find out which have the most similar contents. This optimization just adds another form of filename similarity comparison, but augments it with a file content similarity check as well. Basically, if two files have the same basename and are sufficiently similar to be considered a rename, mark them as such without comparing the two to all other rename candidates. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-15 18:02:16 -08:00
Elijah Newren	a35df3371c	diffcore-rename: compute basenames of source and dest candidates We want to make use of unique basenames among remaining source and destination files to help inform rename detection, so that more likely pairings can be checked first. (src/moduleA/foo.txt and source/module/A/foo.txt are likely related if there are no other 'foo.txt' files among the remaining deleted and added files.) Add a new function, not yet used, which creates a map of the unique basenames within rename_src and another within rename_dst, together with the indices within rename_src/rename_dst where those basenames show up. Non-unique basenames still show up in the map, but have an invalid index (-1). This function was inspired by the fact that in real world repositories, files are often moved across directories without changing names. Here are some sample repositories and the percentage of their historical renames (as of early 2020) that preserved basenames: * linux: 76% * gcc: 64% * gecko: 79% * webkit: 89% These statistics alone don't prove that an optimization in this area will help or how much it will help, since there are also unpaired adds and deletes, restrictions on which basenames we consider, etc., but it certainly motivated the idea to try something in this area. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-15 18:02:16 -08:00
Elijah Newren	f3845257a5	t4001: add a test comparing basename similarity and content similarity Add a simple test where a removed file is similar to two different added files; one of them has the same basename, and the other has a slightly higher content similarity. In the current test, content similarity is weighted higher than filename similarity. Subsequent commits will add a new rule that weighs a mixture of filename similarity and content similarity in a manner that will change the outcome of this testcase. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-15 18:02:16 -08:00
Elijah Newren	829514c515	diffcore-rename: filter rename_src list when possible We have to look at each entry in rename_src a total of rename_dst_nr times. When we're not detecting copies, any exact renames or ignorable rename paths will just be skipped over. While checking that these can be skipped over is a relatively cheap check, it's still a waste of time to do that check more than once, let alone rename_dst_nr times. When rename_src_nr is a few thousand times bigger than the number of relevant sources (such as when cherry-picking a commit that only touched a handful of files, but from a side of history that has different names for some high level directories), this time can add up. First make an initial pass over the rename_src array and move all the relevant entries to the front, so that we can iterate over just those relevant entries. For the testcases mentioned in commit `557ac0350d` ("merge-ort: begin performance work; instrument with trace2_region_* calls", 2020-10-28), this change improves the performance as follows: Before After no-renames: 14.119 s ± 0.101 s 13.815 s ± 0.062 s mega-renames: 1802.044 s ± 0.828 s 1799.937 s ± 0.493 s just-one-mega: 51.391 s ± 0.028 s 51.289 s ± 0.019 s Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-15 18:02:16 -08:00
Hariom Verma	ee82a487f6	ref-filter: use pretty.c logic for trailers Now, ref-filter is using pretty.c logic for setting trailer options. New to ref-filter: :key=<K> - only show trailers with specified key. :valueonly[=val] - only show the value part. :separator=<SEP> - inserted between trailer lines. :key_value_separator=<SEP> - inserted between key and value in trailer lines Enhancement to existing options(now can take value and its optional): :only[=val] :unfold[=val] 'val' can be: true, on, yes or false, off, no. Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Heba Waly <heba.waly@gmail.com> Signed-off-by: Hariom Verma <hariom18599@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-15 16:48:38 -08:00
Hariom Verma	636a0aeedf	pretty.c: capture invalid trailer argument As we would like to use this trailers logic in the ref-filter, it's nice to get an invalid trailer argument. This will allow us to print precise error message while using `format_set_trailers_options()` in ref-filter. For capturing the invalid argument, we changed the working of `format_set_trailers_options()` a little bit. Original logic does "break" and fell through in mainly 2 cases - 1. unknown/invalid argument 2. end of the arg string But now instead of "break", we capture invalid argument and return non-zero. And non-zero is handled by the caller. (We prepared the caller to handle non-zero in the previous commit). Capturing invalid arguments this way will also affects the working of current logic. As at the end of the arg string it will return non-zero. So in order to make things correct, introduced an additional conditional statement i.e if encounter ")", do 'break'. Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Heba Waly <heba.waly@gmail.com> Signed-off-by: Hariom Verma <hariom18599@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-15 16:48:38 -08:00
Hariom Verma	90563aedca	pretty.c: refactor trailer logic to `format_set_trailers_options()` Refactored trailers formatting logic inside pretty.c to a new function `format_set_trailers_options()`. This new function returns the non-zero in case of unusual. The caller handles the non-zero by "goto trailers_out". This change will allow us to reuse the same logic in other places. Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Heba Waly <heba.waly@gmail.com> Signed-off-by: Hariom Verma <hariom18599@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-15 16:48:38 -08:00
Hariom Verma	727331dce1	t6300: use function to test trailer options Add a function to test trailer options. This will make tests look cleaner, as well as will make it easier to add new tests for trailers in the future. Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Heba Waly <heba.waly@gmail.com> Signed-off-by: Hariom Verma <hariom18599@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-15 16:48:38 -08:00
Junio C Hamano	328c109303	The eighth batch Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-12 14:21:04 -08:00
Junio C Hamano	8b25dee615	Merge branch 'tb/precompose-prefix-too' When commands are started from a subdirectory, they may have to compare the path to the subdirectory (called prefix and found out from $(pwd)) with the tracked paths. On macOS, $(pwd) and readdir() yield decomposed path, while the tracked paths are usually normalized to the precomposed form, causing mismatch. This has been fixed by taking the same approach used to normalize the command line arguments. * tb/precompose-prefix-too: MacOS: precompose_argv_prefix()	2021-02-12 14:21:04 -08:00
Junio C Hamano	006c5f79be	Merge branch 'jk/complete-branch-force-delete' The command line completion (in contrib/) completed "git branch -d" with branch names, but "git branch -D" offered tagnames in addition, which has been corrected. "git branch -M" had the same problem. * jk/complete-branch-force-delete: doc/git-branch: fix awkward wording for "-c" completion: handle other variants of "branch -m" completion: treat "branch -D" the same way as "branch -d"	2021-02-12 14:21:04 -08:00
Junio C Hamano	60f8121940	Merge branch 'jv/upload-pack-filter-spec-quotefix' Fix in passing custom args from "git clone" to "upload-pack" on the other side. * jv/upload-pack-filter-spec-quotefix: t5544: clarify 'hook works with partial clone' test upload-pack.c: fix filter spec quoting bug	2021-02-12 14:21:04 -08:00
Junio C Hamano	3c12d0b885	Merge branch 'tb/pack-revindex-on-disk' Introduce an on-disk file to record revindex for packdata, which traditionally was always created on the fly and only in-core. * tb/pack-revindex-on-disk: t5325: check both on-disk and in-memory reverse index pack-revindex: ensure that on-disk reverse indexes are given precedence t: support GIT_TEST_WRITE_REV_INDEX t: prepare for GIT_TEST_WRITE_REV_INDEX Documentation/config/pack.txt: advertise 'pack.writeReverseIndex' builtin/pack-objects.c: respect 'pack.writeReverseIndex' builtin/index-pack.c: write reverse indexes builtin/index-pack.c: allow stripping arbitrary extensions pack-write.c: prepare to write 'pack-.rev' files packfile: prepare for the existence of '.rev' files	2021-02-12 14:21:04 -08:00
Junio C Hamano	2c873f9791	Merge branch 'ab/tests-various-fixup' Various test updates. * ab/tests-various-fixup: rm tests: actually test for SIGPIPE in SIGPIPE test archive tests: use a cheaper "zipinfo -h" invocation to get header upload-pack tests: avoid a non-zero "grep" exit status git-svn tests: rewrite brittle tests to use "--[no-]merges". git svn mergeinfo tests: refactor "test -z" to use test_must_be_empty git svn mergeinfo tests: modernize redirection & quoting style cache-tree tests: explicitly test HEAD and index differences cache-tree tests: use a sub-shell with less indirection cache-tree tests: remove unused $2 parameter cache-tree tests: refactor for modern test style	2021-02-12 14:21:04 -08:00
Elijah Newren	f15eb7c1cf	diffcore-rename: no point trying to find a match better than exact diffcore_rename() had some code to avoid having destination paths that already had an exact rename detected from being re-checked for other renames. Source paths, however, were re-checked because we wanted to allow the possibility of detecting copies. But if copy detection isn't turned on, then this merely amounts to attempting to find a better-than-exact match, which naturally ends up being an expensive no-op. In particular, copy detection is never turned on by the merge machinery. For the testcases mentioned in commit `557ac0350d` ("merge-ort: begin performance work; instrument with trace2_region_* calls", 2020-10-28), this change improves the performance as follows: Before After no-renames: 14.263 s ± 0.053 s 14.119 s ± 0.101 s mega-renames: 5504.231 s ± 5.150 s 1802.044 s ± 0.828 s just-one-mega: 158.534 s ± 0.498 s 51.391 s ± 0.028 s Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-12 12:04:00 -08:00
Ævar Arnfjörð Bjarmason	e7884b353b	test-lib-functions: assert correct parameter count Add assertions of the correct parameter count of various functions, in particularly the wrappers for the shell "test" built-in. In an earlier commit we fixed a bug with an incorrect number of arguments being passed to "test_path_is_{file,missing}". Let's also guard other similar functions from the same sort of misuse. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-12 11:58:21 -08:00
Ævar Arnfjörð Bjarmason	45a2686441	test-lib-functions: remove bug-inducing "diagnostics" helper param Remove the optional "diagnostics" parameter of the test_path_is_{file,dir,missing} functions. We have a lot of uses of these functions, but the only legitimate use of the diagnostics parameter is from when the functions themselves were introduced in `2caf20c52b` (test-lib: user-friendly alternatives to test [-d\|-f\|-e], 2010-08-10). But as the the rest of this diff demonstrates its presence did more to silently introduce bugs in our tests. Fix such bugs in the tests added in `ae4e89e549` (gc: add --keep-largest-pack option, 2018-04-15), and `c04ba51739` (t6046: testcases checking whether updates can be skipped in a merge, 2018-04-19). Let's also assert that those functions are called with exactly one parameter, a follow-up commit will add similar asserts to other functions in test-lib-functions.sh that we didn't have existing misuse of. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-12 11:58:21 -08:00
Ævar Arnfjörð Bjarmason	ebd73f50c6	test libs: rename "diff-lib" to "lib-diff" Rename the "diff-lib" to "lib-diff". With this rename and preceding commits there is no remaining t/lib which doesn't follow the convention of being called t/lib-*. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-12 11:58:21 -08:00
Junio C Hamano	f011795891	Sync with maint	2021-02-11 13:58:52 -08:00
Junio C Hamano	d3a035b055	Merge branch 'en/merge-ort-perf' The "ort" merge strategy. * en/merge-ort-perf: merge-ort: begin performance work; instrument with trace2_region_* calls merge-ort: ignore the directory rename split conflict for now merge-ort: fix massive leak	2021-02-11 13:58:44 -08:00
Junio C Hamano	a21e27ef6b	Merge branch 'en/ort-directory-rename' ORT merge strategy learns to infer "renamed directory" while merging. * en/ort-directory-rename: merge-ort: fix a directory rename detection bug merge-ort: process_renames() now needs more defensiveness merge-ort: implement apply_directory_rename_modifications() merge-ort: add a new toplevel_dir field merge-ort: implement handle_path_level_conflicts() merge-ort: implement check_for_directory_rename() merge-ort: implement apply_dir_rename() and check_dir_renamed() merge-ort: implement compute_collisions() merge-ort: modify collect_renames() for directory rename handling merge-ort: implement handle_directory_level_conflicts() merge-ort: implement compute_rename_counts() merge-ort: copy get_renamed_dir_portion() from merge-recursive.c merge-ort: add outline of get_provisional_directory_renames() merge-ort: add outline for computing directory renames merge-ort: collect which directories are removed in dirs_removed merge-ort: initialize and free new directory rename data structures merge-ort: add new data structures for directory rename detection	2021-02-11 13:58:43 -08:00
Andrew Klotz	f276e2a469	config: improve error message for boolean config Currently invalid boolean config values return messages about 'bad numeric', which is slightly misleading when the error was due to a boolean value. We can improve the developer experience by returning a boolean error message when we know the value is neither a bool text or int. before with an invalid boolean value of `non-boolean`, its unclear what numeric is referring to: fatal: bad numeric config value 'non-boolean' for 'commit.gpgsign': invalid unit now the error message mentions `non-boolean` is a bad boolean value: fatal: bad boolean config value 'non-boolean' for 'commit.gpgsign' Signed-off-by: Andrew Klotz <agc.klotz@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-11 13:44:55 -08:00
Shubham Verma	488acf15df	t7001: use `test` rather than `[` According to Documentation/CodingGuidelines, we should use "test" rather than "[ ... ]" in shell scripts, so let's replace the "[ ... ]" with "test" in the t7001 test script. Signed-off-by: Shubham Verma <shubhunic@gmail.com> Reviewed-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-11 13:42:17 -08:00
Shubham Verma	39252c833e	t7001: use here-docs instead of echo Change from old style to current style by taking advantage of here-docs instead of echo commands. Signed-off-by: Shubham Verma <shubhunic@gmail.com> Reviewed-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-11 13:42:16 -08:00
Shubham Verma	5d683c3f4b	t7001: put each command on a separate line Modern practice is to avoid multiple commands per line, and instead place each command on its own line. Signed-off-by: Shubham Verma <shubhunic@gmail.com> Reviewed-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-11 13:42:16 -08:00
Shubham Verma	d2ecddc981	t7001: use '>' rather than 'touch' Use `>` rather than `touch` to create an empty file when the timestamp isn't relevant to the test. Signed-off-by: Shubham Verma <shubhunic@gmail.com> Reviewed-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-11 13:42:16 -08:00
Shubham Verma	368d278249	t7001: avoid using `cd` outside of subshells Avoid using `cd` outside of subshells since, if the test fails, there is no guarantee that the current working directory is the expected one, which may cause subsequent tests to run in the wrong directory. While at it, make some other tests more concise by replacing simple subshells with `git -C`. Signed-off-by: Shubham Verma <shubhunic@gmail.com> Reviewed-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-11 13:42:16 -08:00
Shubham Verma	dd72154149	t7001: remove whitespace after redirect operators According to Documentation/CodingGuidelines, there should be no whitespace after redirect operators. So, we should remove these whitespaces after redirect operators. Signed-off-by: Shubham Verma <shubhunic@gmail.com> Reviewed-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-11 13:42:16 -08:00
Shubham Verma	9bcaeb71a6	t7001: modernize subshell formatting Some test use an old style for formatting subshells: (command && ... Update them to the modern style: ( command && ... Signed-off-by: Shubham Verma <shubhunic@gmail.com> Reviewed-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-11 13:42:16 -08:00
Shubham Verma	9b46e9c9cc	t7001: remove unnecessary blank lines Some tests use a deprecated style in which there are unnecessary blank lines after the opening quote of the test body and before the closing quote. So we should remove these unnecessary blank lines. Signed-off-by: Shubham Verma <shubhunic@gmail.com> Reviewed-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-11 13:42:16 -08:00
Shubham Verma	a76d90670a	t7001: indent with TABs instead of spaces Signed-off-by: Shubham Verma <shubhunic@gmail.com> Reviewed-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-11 13:42:16 -08:00
Shubham Verma	5712d62ccf	t7001: modernize test formatting Some tests in this script are formatted using a very old style: test_expect_success \ 'title' \ 'body line 1 && body line 2' Update the formatting to the modern style: test_expect_success 'title' ' body line 1 && body line 2 ' Signed-off-by: Shubham Verma <shubhunic@gmail.com> Reviewed-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-11 13:42:16 -08:00
Denton Liu	3e885f0277	stash: declare ref_stash as an array Save sizeof(const char *) bytes by declaring ref_stash as an array instead of having a redundant pointer to an array. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-11 13:34:58 -08:00
Denton Liu	8c2462d1fe	t3905: use test_cmp() to check file contents Modernize the script by doing file content comparisons using test_cmp() instead of `test x = "$(cat file)"`. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-11 13:34:58 -08:00
Denton Liu	27e25a8cbf	t3905: replace test -s with test_file_not_empty In order to modernize the test script, replace `test -s` with test_file_not_empty(), which provides better diagnostic output in the case of failure. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-11 13:34:58 -08:00
Denton Liu	389ece4022	t3905: remove nested git in command substitution If a git command in a nested command substitution fails, it will be silently ignored since only the return code of the outer command substitutions is reported. Factor out nested command substitutions so that the error codes of those commands are reported. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-11 13:34:58 -08:00
Denton Liu	bbaa45c3aa	t3905: move all commands into test cases In order to modernize the tests, move commands that currently run outside of test cases into a test case. Where possible, clean up files that are produced using test_when_finished() but in the case where files persist over multiple test cases, create a new test case to perform cleanup. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-11 13:34:58 -08:00
Denton Liu	32b7385e43	t3905: remove spaces after redirect operators For shell scripts, the usual convention is for there to be no space after redirection operators, (e.g. `>file`, not `> file`). Remove these spaces wherever they appear. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-11 13:34:58 -08:00
Denton Liu	d6ab8b1929	git-stash.txt: be explicit about subcommand options Currently, the options for the `list` and `show` subcommands are just listed as `<options>`. This seems to imply, from a cursory glance at the summary, that they take the stash options listed below. However, reading more carefully, we see that they take log options and diff options respectively. Make it more obvious that they take log and diff options by explicitly stating this in the subcommand summary. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-11 13:34:58 -08:00
Jeff King	16950f8384	rev-list: add --disk-usage option for calculating disk usage It can sometimes be useful to see which refs are contributing to the overall repository size (e.g., does some branch have a bunch of objects not found elsewhere in history, which indicates that deleting it would shrink the size of a clone). You can find that out by generating a list of objects, getting their sizes from cat-file, and then summing them, like: git rev-list --objects --no-object-names main..branch git cat-file --batch-check='%(objectsize:disk)' \| perl -lne '$total += $_; END { print $total }' Though note that the caveats from git-cat-file(1) apply here. We "blame" base objects more than their deltas, even though the relationship could easily be flipped. Still, it can be a useful rough measure. But one problem is that it's slow to run. Teaching rev-list to sum up the sizes can be much faster for two reasons: 1. It skips all of the piping of object names and sizes. 2. If bitmaps are in use, for objects that are in the bitmapped packfile we can skip the oid_object_info() lookup entirely, and just ask the revindex for the on-disk size. This patch implements a --disk-usage option which produces the same answer in a fraction of the time. Here are some timings using a clone of torvalds/linux: [rev-list piped to cat-file, no bitmaps] $ time git rev-list --objects --no-object-names --all \| git cat-file --buffer --batch-check='%(objectsize:disk)' \| perl -lne '$total += $_; END { print $total }' 1459938510 real 0m29.635s user 0m38.003s sys 0m1.093s [internal, no bitmaps] $ time git rev-list --disk-usage --objects --all 1459938510 real 0m31.262s user 0m30.885s sys 0m0.376s Even though the wall-clock time is slightly worse due to parallelism, notice the CPU savings between the two. We saved 21% of the CPU just by avoiding the pipes. But the real win is with bitmaps. If we use them without the new option: [rev-list piped to cat-file, bitmaps] $ time git rev-list --objects --no-object-names --all --use-bitmap-index \| git cat-file --batch-check='%(objectsize:disk)' \| perl -lne '$total += $_; END { print $total }' 1459938510 real 0m6.244s user 0m8.452s sys 0m0.311s then we're faster to generate the list of objects, but we still spend a lot of time piping and looking things up. But if we do both together: [internal, bitmaps] $ time git rev-list --disk-usage --objects --all --use-bitmap-index 1459938510 real 0m0.219s user 0m0.169s sys 0m0.049s then we get the same answer much faster. For "--all", that answer will correspond closely to "du objects/pack", of course. But we're actually checking reachability here, so we're still fast when we ask for more interesting things: $ time git rev-list --disk-usage --use-bitmap-index v5.0..v5.10 374798628 real 0m0.429s user 0m0.356s sys 0m0.072s Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-11 09:57:55 -08:00
Johannes Schindelin	c85eec7fc3	commit-graph: when incompatible with graphs, indicate why When `gc.writeCommitGraph = true`, it is possible that the commit-graph is _still_ not written: replace objects, grafts and shallow repositories are incompatible with the commit-graph feature. Under such circumstances, we need to indicate to the user why the commit-graph was not written instead of staying silent about it. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Acked-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-11 09:33:01 -08:00
Johannes Schindelin	c809798b2a	reflog expire --stale-fix: be generous about missing objects Whenever a user runs `git reflog expire --stale-fix`, the most likely reason is that their repository is at least _somewhat_ corrupt. Which means that it is more than just possible that some objects are missing. If that is the case, that can currently let the command abort through the phase where it tries to mark all reachable objects. Instead of adding insult to injury, let's be gentle and continue as best as we can in such a scenario, simply by ignoring the missing objects and moving on. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-11 09:21:52 -08:00
Ævar Arnfjörð Bjarmason	c45dc9cf30	diff: plug memory leak from regcomp() on {log,diff} -I Fix a memory leak in `296d4a94e7` (diff: add -I<regex> that ignores matching changes, 2020-10-20) by freeing the memory it allocates in the newly introduced diff_free(). See the previous commit for details on that. This memory leak was intentionally introduced in `296d4a94e7`, see the discussion on a previous iteration of it in https://lore.kernel.org/git/xmqqeelycajx.fsf@gitster.c.googlers.com/ At that time freeing the memory was somewhat tedious, but since it isn't anymore with the newly introduced diff_free() let's use it. Let's retain the pattern for diff_free_file() and add a diff_free_ignore_regex(), even though (unlike "diff_free_file") we don't need to call it elsewhere. I think this'll make for more readable code than gradually accumulating a giant diff_free() function, sharing "int i" across unrelated code etc. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-11 09:21:07 -08:00
Ævar Arnfjörð Bjarmason	e900d494dc	diff: add an API for deferred freeing Add a diff_free() function to free anything we may have allocated in the "diff_options" struct, and the ability to make calling it a noop by setting "no_free" in "diff_options". This is required because when e.g. "git diff" is run we'll allocate things in that struct, use the diff machinery once, and then exit. But if we run e.g. "git log -p" we're going to re-use what we allocated across multiple diff_flush() calls, and only want to free things at the end. We've thus ended up with features like the recently added "diff -I"[1] where we'll leak memory. As it turns out it could have simply used the pattern established in `6ea57703f6` (log: prepare log/log-tree to reuse the diffopt.close_file attribute, 2016-06-22). Manually adding more such flags to things log_tree_commit() every time we need to allocate something would be tedious. Let's instead move that fclose() code it to a new diff_free(), in anticipation of freeing more things in that function in follow-up commits. Some functions such as log_tree_commit() need an idiom of optionally retaining a previous "no_free", as they may either free the memory themselves, or their caller may do so. I'm keeping that idiom in log_show_early() for good measure, even though I don't think it's currently called in this manner. It also gets passed an existing "struct rev_info", so future callers may want to set the "no_free" flag. This change is a bit hard to read because while the freeing pattern we're introducing isn't unusual, the "file" member is a special snowflake. We usually don't want to fclose() it. This is because "file" is usually stdout, in which case we don't want to fclose() it. We only want to opt-in to closing it when we e.g. open a file on the filesystem. Thus the opt-in "close_file" flag. So the API in general just needs a "no_free" flag to defer freeing, but the "file" member still needs its "close_file" flag. This is made more confusing because while refactoring this code we could replace some "close_file=0" with "no_free=1", whereas others need to set both flags. This is because there were some cases where an existing "close_file=0" meant "let's defer deallocation", and others where it meant "we don't want to close this file handle at all". 1. `296d4a94e7` (diff: add -I<regex> that ignores matching changes, 2020-10-20) Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-11 09:21:05 -08:00
Ævar Arnfjörð Bjarmason	1108cea7f8	tests: remove most uses of test_i18ncmp As a follow-up to `d162b25f95` (tests: remove support for GIT_TEST_GETTEXT_POISON, 2021-01-20) remove most uses of test_i18ncmp via a simple s/test_i18ncmp/test_cmp/g search-replacement. I'm leaving t6300-for-each-ref.sh out due to a conflict with in-flight changes between "master" and "seen", as well as the prerequisite itself due to other changes between "master" and "next/seen" which add new test_i18ncmp uses. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-10 23:48:27 -08:00
Ævar Arnfjörð Bjarmason	b1e079807b	tests: remove last uses of C_LOCALE_OUTPUT Remove the last uses of the C_LOCALE_OUTPUT prerequisite as well as the prerequisite itself. This is a follow-up to `d162b25f95` (tests: remove support for GIT_TEST_GETTEXT_POISON, 2021-01-20), as well as the preceding commit where we removed the simpler uses of C_LOCALE_OUTPUT. Here I'm slightly refactoring a test added in `21e5ad50fc` (safecrlf: Add mechanism to warn about irreversible crlf conversions, 2008-02-06), as well as getting rid of another "test_have_prereq C_LOCALE_OUTPUT" use. I'm not leaving the prerequisite itself in place for in-flight changes as there currently are none that introduce new tests that rely on it, and because C_LOCALE_OUTPUT is currently a noop on the master branch we likely won't have any new submissions that use it. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-10 23:48:27 -08:00
Ævar Arnfjörð Bjarmason	a926c4b904	tests: remove most uses of C_LOCALE_OUTPUT As a follow-up to `d162b25f95` (tests: remove support for GIT_TEST_GETTEXT_POISON, 2021-01-20) remove those uses of the now always true C_LOCALE_OUTPUT prerequisite from those tests which declare it as an argument to test_expect_{success,failure}. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-10 23:48:26 -08:00
Ævar Arnfjörð Bjarmason	780aa0a21e	tests: remove last uses of GIT_TEST_GETTEXT_POISON=false Follow-up my `73c01d25fe` (tests: remove uses of GIT_TEST_GETTEXT_POISON=false, 2021-01-20) by removing the last uses of GIT_TEST_GETTEXT_POISON=*. These assignments were part of branch that was in-flight at the time of the gettext poison removal. See `466f94ec45` (Merge branch 'ab/detox-gettext-tests', 2021-02-10) and `c7d6d419b0` (Merge branch 'ab/mktag', 2021-01-25) for the merging of the two branches. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-10 23:48:26 -08:00
Martin von Zweigbergk	fa9ab027ba	docs: clarify that refs/notes/ do not keep the attached objects alive `git help gc` contains this snippet: "[...] it will keep [..] objects referenced by the index, remote-tracking branches, notes saved by git notes under refs/notes/" I had interpreted that as saying that the objects that notes were attached to are kept, but that is not the case. Let's clarify the documentation by moving out the part about git notes to a separate sentence. Signed-off-by: Martin von Zweigbergk <martinvonz@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-10 23:43:55 -08:00
brian m. carlson	9b27b49240	gpg-interface: remove other signature headers before verifying When we have a multiply signed commit, we need to remove the signature in the header before verifying the object, since the trailing signature will not be over both pieces of data. Do so, and verify that we validate the signature appropriately. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-10 23:35:42 -08:00
brian m. carlson	88bce0e24c	ref-filter: hoist signature parsing When we parse a signature in the ref-filter code, we continually increment the buffer pointer. Hoist the signature parsing above the blank line delimiting headers and body so we can find the signature when using a header to sign the buffer. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-10 23:35:42 -08:00
brian m. carlson	937032e14a	commit: allow parsing arbitrary buffers with headers Currently only commits are signed with headers. However, in the future, we'll also sign tags with headers as well. Let's refactor out a function called parse_buffer_signed_by_header which does exactly that. In addition, since we'll want to sign things other than commits this way, let's call the function sign_with_header instead of do_sign_commit. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-10 23:35:42 -08:00
brian m. carlson	482c119186	gpg-interface: improve interface for parsing tags We have a function which parses a buffer with a signature at the end, parse_signature, and this function is used for signed tags. However, we'll need to store values for multiple algorithms, and we'll do this by using a header for the non-default algorithm. Adjust the parse_signature interface to store the parsed data in two strbufs and turn the existing function into parse_signed_buffer. The latter is still used in places where we know we always have a signed buffer, such as push certs. Adjust all the callers to deal with this new interface. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-10 23:35:42 -08:00
Junio C Hamano	c6102b7585	Merge branch 'tb/ci-run-cocci-with-18.04' The version of Ubuntu Linux used by default at GitHub Actions CI has been updated to one that lack coccinelle; until it gets fixed, work it around by sticking to the previous release (18.04). * tb/ci-run-cocci-with-18.04: .github/workflows/main.yml: run static-analysis on bionic	2021-02-10 16:48:07 -08:00
Junio C Hamano	f9f2520108	The seventh batch Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-10 14:48:33 -08:00
Junio C Hamano	466f94ec45	Merge branch 'ab/detox-gettext-tests' Get rid of "GETTEXT_POISON" support altogether, which may or may not be controversial. * ab/detox-gettext-tests: tests: remove uses of GIT_TEST_GETTEXT_POISON=false tests: remove support for GIT_TEST_GETTEXT_POISON ci: remove GETTEXT_POISON jobs	2021-02-10 14:48:33 -08:00
Junio C Hamano	59ace284f3	Merge branch 'ab/grep-pcre-invalid-utf8' Update support for invalid UTF-8 in PCRE2. * ab/grep-pcre-invalid-utf8: grep/pcre2: better support invalid UTF-8 haystacks grep/pcre2 tests: don't rely on invalid UTF-8 data test	2021-02-10 14:48:33 -08:00
Junio C Hamano	0199c68d01	Merge branch 'ab/retire-pcre1' The support for deprecated PCRE1 library has been dropped. * ab/retire-pcre1: Remove support for v1 of the PCRE library config.mak.uname: remove redundant NO_LIBPCRE1_JIT flag	2021-02-10 14:48:33 -08:00
Junio C Hamano	938ecaa42f	Merge branch 'jk/pretty-lazy-load-commit' Some pretty-format specifiers do not need the data in commit object (e.g. "%H"), but we were over-eager to load and parse it, which has been made even lazier. * jk/pretty-lazy-load-commit: pretty: lazy-load commit data when expanding user-format	2021-02-10 14:48:33 -08:00
Junio C Hamano	2f794620f5	Merge branch 'ds/more-index-cleanups' Cleaning various codepaths up. * ds/more-index-cleanups: t1092: test interesting sparse-checkout scenarios test-lib: test_region looks for trace2 regions sparse-checkout: load sparse-checkout patterns name-hash: use trace2 regions for init repository: add repo reference to index_state fsmonitor: de-duplicate BUG()s around dirty bits cache-tree: extract subtree_pos() cache-tree: simplify verify_cache() prototype cache-tree: clean up cache_tree_update()	2021-02-10 14:48:33 -08:00
Junio C Hamano	02fb21617e	Merge branch 'rs/worktree-list-verbose' `git worktree list` now annotates worktrees as prunable, shows locked and prunable attributes in --porcelain mode, and gained a --verbose option. * rs/worktree-list-verbose: worktree: teach `list` verbose mode worktree: teach `list` to annotate prunable worktree worktree: teach `list --porcelain` to annotate locked worktree t2402: ensure locked worktree is properly cleaned up worktree: teach worktree_lock_reason() to gently handle main worktree worktree: teach worktree to lazy-load "prunable" reason worktree: libify should_prune_worktree()	2021-02-10 14:48:32 -08:00
Junio C Hamano	7e94720c1e	Merge branch 'js/rebase-i-commit-cleanup-fix' When "git rebase -i" processes "fixup" insn, there is no reason to clean up the commit log message, but we did the usual stripspace processing. This has been corrected. * js/rebase-i-commit-cleanup-fix: rebase -i: do leave commit message intact in fixup! chains	2021-02-10 14:48:32 -08:00
Junio C Hamano	e5abed92f5	Merge branch 'jk/t0000-cleanups' Code clean-up. * jk/t0000-cleanups: t0000: consistently use single quotes for outer tests t0000: run cleaning test inside sub-test t0000: run prereq tests inside sub-test t0000: keep clean-up tests together	2021-02-10 14:48:32 -08:00
Junio C Hamano	04703f64be	Merge branch 'sg/t7800-difftool-robustify' Test fix. * sg/t7800-difftool-robustify: t7800-difftool: don't accidentally match tmp dirs	2021-02-10 14:48:32 -08:00
Junio C Hamano	c9f94ab4fa	Merge branch 'ab/lose-grep-debug' Lose the debugging aid that may have been useful in the past, but no longer is, in the "grep" codepaths. * ab/lose-grep-debug: grep/log: remove hidden --debug and --grep-debug options	2021-02-10 14:48:31 -08:00
Junio C Hamano	9d5b1c06ac	Merge branch 'jk/use-oid-pos' Code clean-up to ensure our use of hashtables using object names as keys use the "struct object_id" objects, not the raw hash values. * jk/use-oid-pos: oid_pos(): access table through const pointers hash_pos(): convert to oid_pos() rerere: use strmap to store rerere directories rerere: tighten rr-cache dirname check rerere: check dirname format while iterating rr_cache directory commit_graft_pos(): take an oid instead of a bare hash	2021-02-10 14:48:31 -08:00
Eric Wong	a5cdca4520	t1500: ensure current --since= behavior remains This behavior of git-rev-parse is observed since git 1.8.3.1 at least(), and likely earlier versions. At least one git-reliant project in-the-wild relies on this current behavior of git-rev-parse being able to handle multiple --since= arguments without squeezing identical results together. So add a test to prevent the potential for regression in downstream projects. () 1.8.3.1 the version packaged for CentOS 7.x Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-10 14:24:13 -08:00
Charvi Mendiratta	fa153c1cd7	doc/rebase -i: fix typo in the documentation of 'fixup' command Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Charvi Mendiratta <charvi077@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-10 13:58:19 -08:00
Charvi Mendiratta	9ff6b74bb7	t/t3437: fixup the test 'multiple fixup -c opens editor once' In the test, FAKE_COMMIT_MESSAGE replaces the commit message each time it is invoked so there will be only one instance of "Modified-A3" no matter how many times we invoke the editor. Let's fix this and use FAKE_COMMIT_AMEND instead so that it adds "Modified-A3" once for each time the editor is invoked. This patch also removes the check for counting the number of "Modified-A3" lines and instead compares the whole message to check that the commenting code works correctly for 'fixup -c' as well as 'fixup -C'. Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Charvi Mendiratta <charvi077@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-10 13:58:19 -08:00
Charvi Mendiratta	9c7650c45c	t/t3437: use named commits in the tests Use the named commits in the tests so that they will still refer to the same commit if the setup gets changed in the future whereas 'branch~2' will change which commit it points to. Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Charvi Mendiratta <charvi077@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-10 13:58:19 -08:00
Charvi Mendiratta	d8bd08066d	t/t3437: simplify and document the test helpers Let's simplify the test_commit_message() helper function and add comments to the function. This patch also document the working of 'fixup -C' with "amend!" in the test-description. Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Charvi Mendiratta <charvi077@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-10 13:58:19 -08:00
Charvi Mendiratta	4755fed0a6	t/t3437: check the author date of fixed up commit Add '%at' format in the get_author() function and update the test to check that the author date of the fixed up commit is unchanged. Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Charvi Mendiratta <charvi077@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-10 13:58:19 -08:00
Charvi Mendiratta	733ad2e15a	t/t3437: remove the dependency of 'expected-message' file from tests As it is currently implemented, it's too difficult to follow along and remember the value of "expected-message" from test to test. It also makes it difficult to extend tests or add new tests in between existing tests without negatively impacting other tests. Let's set up "expected-message" to the precise content needed by the test, so that both the problems go away and also makes easier to run tests selectively with '--run' or 'GIT_SKIP_TESTS' Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Charvi Mendiratta <charvi077@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-10 13:58:19 -08:00
Charvi Mendiratta	17665167bb	t/t3437: fixup here-docs in the 'setup' test The most common way to format here-docs in Git test scripts is for the body and EOF to be indented the same amount as the command which opened the here-doc. Fix a few here-docs in this script to conform to that standard and also remove the unnecessary curly braces. Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Charvi Mendiratta <charvi077@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-10 13:58:19 -08:00
Charvi Mendiratta	75ace8329c	t/lib-rebase: update the documentation of FAKE_LINES FAKE_LINES helper function use underscore to embed a space in a single command. Let's document it and also update the list of commands. Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Charvi Mendiratta <charvi077@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-10 13:58:19 -08:00
Charvi Mendiratta	f07871d302	rebase -i: clarify and fix 'fixup -c' rebase-todo help When `-c` says "edit the commit message" it's not clear what will be edited. The original's commit message or the replacement's message or a combination of the two. Word it such that it states more precisely what exactly will be edited. While at it, also drop the jarring period and capitalization, neither of which is otherwise present in the message. Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Charvi Mendiratta <charvi077@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-10 13:58:19 -08:00
Ævar Arnfjörð Bjarmason	59934417ff	t/.gitattributes: sort lines Sort the lines starting with "/", the only out-of-place line was added along with most of the file in `614f4f0f35` (Fix the remaining tests that failed with core.autocrlf=true, 2017-05-09). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-10 13:54:34 -08:00
Ævar Arnfjörð Bjarmason	ddfe900612	test-lib-functions: move function to lib-bitmap.sh Move a function added to test-lib-functions.sh in `ea047a8eb4` (t5310: factor out bitmap traversal comparison, 2020-02-14) into a new lib-bitmap.sh. The test-lib-functions.sh file should be for functions that are widely used across the test suite, if something's only used by a few tests it makes more sense to have it in a lib-*.sh file. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-10 13:54:34 -08:00
Ævar Arnfjörð Bjarmason	3fca1fc651	test libs: rename gitweb-lib.sh to lib-gitweb.sh Rename gitweb-lib.sh to lib-gitweb.sh for consistency with other test library files. When it was introduced in `05526071cb` (gitweb: split test suite into library and tests, 2009-08-27) this naming pattern was more common. Since then all but one other such library which didn't start with "lib-.sh" such as t6000lib.sh has been been renamed, see e.g. `9d488eb40e` (Move t6000lib.sh to lib-, 2010-05-07). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-10 13:54:34 -08:00
Ævar Arnfjörð Bjarmason	e8a8e7ff98	test libs: rename bundle helper to "lib-bundle.sh" Rename the recently introduced test-bundle-functions.sh to be consistent with other lib-.sh files, which is the convention for these sorts of shared test library functions. The new test-bundle-functions.sh was introduced in `9901164d81` (test: add helper functions for git-bundle, 2021-01-11). It was the only test-.sh of this nature. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-10 13:54:34 -08:00
Ævar Arnfjörð Bjarmason	f3ad2bf471	test-lib-functions: remove generate_zero_bytes() wrapper Since `d5cfd142ec` (tests: teach the test-tool to generate NUL bytes and use it, 2019-02-14) the generate_zero_bytes() functions has been a thin wrapper for "test-tool genzeros". Let's have its only user call that directly instead. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-10 13:54:34 -08:00
Ævar Arnfjörð Bjarmason	762ccf9906	test-lib-functions: move test_set_index_version() to its user Move the test_set_index_version() function to its only user. This function has only been used in one place since its addition in `5d9fc888b4` (test-lib: allow setting the index format version, 2014-02-23). Let's have that test script define it. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-10 13:54:34 -08:00
Ævar Arnfjörð Bjarmason	9e9c7dd6f1	test lib: change "error" to "BUG" as appropriate Change two uses of "error" in test-lib-functions.sh to "BUG". In the first instance in "test_cmp_rev" the author of the "BUG" function added in [1] had another in-flight patch adding this in [2], and the two were never consolidated. In the second case in "test_atexit" added in [3] that we could have instead used "BUG" appears to have been missed. 1. `165293af3c` (tests: send "bug in the test script" errors to the script's stderr, 2018-11-19) 2. `30d0b6dccb` (test-lib-functions: make 'test_cmp_rev' more informative on failure, 2018-11-19) 3. `900721e15c` (test-lib: introduce 'test_atexit', 2019-03-13) Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-10 13:54:34 -08:00
Ævar Arnfjörð Bjarmason	c0eedbc009	test-lib: remove check_var_migration Remove the check_var_migration() migration helper. This was added back in [1], [2] and [3] to warn users to migrate from e.g. the "GIT_FSMONITOR_TEST" name to "GIT_TEST_FSMONITOR". I daresay that having been warning about this since late 2018 (or v2.20.0) was sufficient time to give everyone interested a heads-up about moving to the new names. I don't see the need for going through the "do this later" codepath anticipated in [1], let's just remove this instead. 1. `4cb54d0aa8` (fsmonitor: update GIT_TEST_FSMONITOR support, 2018-09-18) 2. `1f357b045b` (read-cache: update TEST_GIT_INDEX_VERSION support, 2018-09-18) 3. `5765d97b71` (preload-index: update GIT_FORCE_PRELOAD_TEST support, 2018-09-18) Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-10 13:54:34 -08:00
Jeff King	a38cb9878a	mailmap: only look for .mailmap in work tree When trying to find a .mailmap file, we will always look for it in the current directory. This makes sense in a repository with a working tree, since we'd always go to the toplevel directory at startup. But for a bare repository, it can be confusing. With an option like --git-dir (or $GIT_DIR in the environment), we don't chdir at all, and we'd read .mailmap from whatever directory you happened to be in before starting Git. (Note that --git-dir without specifying a working tree historically means "the current directory is the root of the working tree", but most bare repositories will have core.bare set these days, meaning they will realize there is no working tree at all). The documentation for gitmailmap(5) says: If the file `.mailmap` exists at the toplevel of the repository[...] which likewise reinforces the notion that we are looking in the working tree. This patch prevents us from looking for such a file when we're in a bare repository. This does break something that used to work: cd bare.git git cat-file blob HEAD:.mailmap >.mailmap git shortlog But that was never advertised in the documentation. And these days we have mailmap.blob (which defaults to HEAD:.mailmap) to do the same thing in a much cleaner way. However, there's one more interesting case: we might not have a repository at all! The git-shortlog command can be run with git-log output fed on its stdin, and it will apply the mailmap. In that case, it probably does make sense to read .mailmap from the current directory. This patch will continue to do so. That leads to one even weirder case: if you run git-shortlog to process stdin, the input _could_ be from a different repository entirely. Should we respect the in-tree .mailmap then? Probably yes. Whatever the source of the input, if shortlog is running in a repository, the documentation claims that we'd read the .mailmap from its top-level (and of course it's reasonably likely that it _is_ from the same repo, and the user just preferred to run git-log and git-shortlog separately for whatever reason). The included test covers these cases, and we now document the "no repo" case explicitly. We also add a test that confirms we find a top-level ".mailmap" even when we start in a subdirectory of the working tree. This worked both before and after this commit, but we never tested it explicitly (it works because we always chdir to the top-level of the working tree if there is one). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-10 13:34:51 -08:00
Johannes Schindelin	e89f89361c	fsck --name-objects: be more careful parsing generation numbers In `7b35efd734` (fsck_walk(): optionally name objects on the go, 2016-07-17), the `fsck` machinery learned to optionally name the objects, so that it is easier to see what part of the repository is in a bad shape, say, when objects are missing. To save on complexity, this machinery uses a parser to determine the name of a parent given a commit's name: any `~<n>` suffix is parsed and the parent's name is formed from the prefix together with `~<n+1>`. However, this parser has a bug: if it finds a suffix `<n>` that is _not_ `~<n>`, it will mistake the empty string for the prefix and `<n>` for the generation number. In other words, it will generate a name of the form `~<bogus-number>`. Let's fix this. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-10 12:38:05 -08:00
Johannes Schindelin	8c891eed3a	t1450: robustify `remove_object()` This function can be simplified by using the `test_oid_to_path()` helper, which incidentally also makes it more robust by not relying on the exact file system layout of the loose object files. While at it, do not define those functions in a test case, it buys us nothing. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-10 12:38:00 -08:00
Matheus Tavares	42d906bec4	grep: honor sparse-checkout on working tree searches On a sparse checked out repository, `git grep` (without --cached) ends up searching the cache when an entry matches the search pathspec and has the SKIP_WORKTREE bit set. This is confusing both because the sparse paths are not expected to be in a working tree search (as they are not checked out), and because the output mixes working tree and cache results without distinguishing them. (Note that grep also resorts to the cache on working tree searches that include --assume-unchanged paths. But the whole point in that case is to assume that the contents of the index entry and the file are the same. This does not apply to the case of sparse paths, where the file isn't even expected to be present.) Fix that by teaching grep to honor the sparse-checkout rules for working tree searches. If the user wants to grep paths outside the current sparse-checkout definition, they may either update the sparsity rules to materialize the files, or use --cached to search all blobs registered in the index. Note: it might also be interesting to add a configuration option that allow users to search paths that are present despite having the SKIP_WORKTREE bit set, and/or to restrict searches in the index and past revisions too. These ideas are left as future improvements to avoid conflicting with other sparse-checkout topics currently in flight. Suggested-by: Elijah Newren <newren@gmail.com> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-09 23:10:51 -08:00
Derrick Stolee	acc1c4d5d4	maintenance: incremental strategy runs pack-refs weekly When the 'maintenance.strategy' config option is set to 'incremental', a default maintenance schedule is enabled. Add the 'pack-refs' task to that strategy at the weekly cadence. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-09 23:09:29 -08:00
Derrick Stolee	41abfe15d9	maintenance: add pack-refs task It is valuable to collect loose refs into a more compressed form. This is typically the packed-refs file, although this could be the reftable in the future. Having packed refs can be extremely valuable in repos with many tags or remote branches that are not modified by the local user, but still are necessary for other queries. For instance, with many exploded refs, commands such as git describe --tags --exact-match HEAD can be very slow (multiple seconds). This command in particular is used by terminal prompts to show when a detatched HEAD is pointing to an existing tag, so having it be slow causes significant delays for users. Add a new 'pack-refs' maintenance task. It runs 'git pack-refs --all --prune' to move loose refs into a packed form. For now, that is the packed-refs file, but could adjust to other file formats in the future. This is the first of several sub-tasks of the 'gc' task that could be extracted to their own tasks. In this process, we should not change the behavior of the 'gc' task since that remains the default way to keep repositories maintained. Creating a new task for one of these sub-tasks only provides more customization options for those choosing to not use the 'gc' task. It is certainly possible to have both the 'gc' and 'pack-refs' tasks enabled and run regularly. While they may repeat effort, they do not conflict in a destructive way. The 'auto_condition' function pointer is left NULL for now. We could extend this in the future to have a condition check if pack-refs should be run during 'git maintenance run --auto'. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-09 23:09:24 -08:00
Jonathan Tan	0a9dde4a04	usage: trace2 BUG() invocations die() messages are traced in trace2, but BUG() messages are not. Anyone tracking die() messages would have even more reason to track BUG(). Therefore, write to trace2 when BUG() is invoked. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-09 14:14:34 -08:00
Seth House	9d9cf23031	mergetool: add per-tool support and overrides for the hideResolved flag Add a per-tool override flag so that users may enable the flag for one tool and disable it for another by setting `mergetool.<tool>.hideResolved` to `false`. In addition, the author or maintainer of a mergetool may optionally override the default `hideResolved` value for that mergetool. If the `mergetools/<tool>` shell script contains a `hide_resolved_enabled` function it will be called when the mergetool is invoked and the return value will be used as the default for the `hideResolved` flag. hide_resolved_enabled () { return 1 } Disabling may be desirable if the mergetool wants or needs access to the original, unmodified 'LOCAL' and 'REMOTE' versions of the conflicted file. For example: - A tool may use a custom conflict resolution algorithm and prefer to ignore the results of Git's conflict resolution. - A tool may want to visually compare/constrast the version of the file from before the merge (saved to 'LOCAL', 'REMOTE', and 'BASE') with Git's conflict resolution results (saved to 'MERGED'). Helped-by: Johannes Sixt <j6t@kdbg.org> Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Seth House <seth@eseth.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-09 14:09:16 -08:00
Seth House	de8dafbada	mergetool: break setup_tool out into separate initialization function This is preparation for the following commit where we need to source the mergetool shell script to look for overrides before `run_merge_tool` is called. Previously `run_merge_tool` both sourced that script and invoked the mergetool. In the case of the following commit, we need the result of the `hide_resolved` override, if present, before we actually run `run_merge_tool`. The new `initialize_merge_tool` wrapper is exposed and documented as a public interface for consistency with the existing `run_merge_tool` which is also public. Although `setup_tool` could instead be exposed directly, the related `setup_user_tool` would probably also want to be elevated to match and this felt the cleanest to me. Signed-off-by: Seth House <seth@eseth.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-09 14:09:16 -08:00
Seth House	98ea309b3f	mergetool: add hideResolved configuration The purpose of a mergetool is to help the user resolve any conflicts that Git cannot automatically resolve. If there is a conflict that must be resolved manually Git will write a file named MERGED which contains everything Git was able to resolve by itself and also everything that it was not able to resolve wrapped in conflict markers. One way to think of MERGED is as a two- or three-way diff. If each "side" of the conflict markers is separately extracted an external tool can represent those conflicts as a side-by-side diff. However many mergetools instead diff LOCAL and REMOTE both of which contain versions of the file from before the merge. Since the conflicts Git resolved automatically are not present it forces the user to manually re-resolve those conflicts. Some mergetools also show MERGED but often only for reference and not as the focal point to resolve the conflicts. This adds a `mergetool.hideResolved` flag that will overwrite LOCAL and REMOTE with each corresponding "side" of a conflicted file and thus hide all conflicts that Git was able to resolve itself. Overwriting these files will immediately benefit any mergetool that uses them without requiring any changes to the tool. No adverse effects were noted in a small survey of popular mergetools[1] so this behavior defaults to `true`. However it can be globally disabled by setting `mergetool.hideResolved` to `false`. [1] https://www.eseth.org/2020/mergetools.html `c884424769/2020/mergetools.md` Original-implementation-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Seth House <seth@eseth.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-09 14:09:16 -08:00
Jeff King	3803a3a099	t: add --no-tag option to test_commit One of the conveniences that test_commit offers is making a tag for each commit. This makes it easy to refer to the commits in subsequent commands. But it can also be a pain if you care about reachability, because those tags keep the commits reachable even if they are rewound from the branch they're made on. The alternative is that scripts have to call test_tick, git-add, and git-commit themselves. Let's add a --no-tag option to give them the one-liner convenience of using test_commit. This is in preparation for the next patch, which will add some more calls. But I cleaned up an existing site to show off the feature. There are probably more cleanups possible. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-09 13:36:06 -08:00
Matheus Tavares	0c5d83b248	grep: error out if --untracked is used with --cached The options --untracked and --cached are not compatible, but if they are used together, grep just silently ignores --cached and searches the working tree. Error out, instead, to avoid any potential confusion. Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-09 12:39:06 -08:00
Junio C Hamano	1d4f2316c5	Sync with 2.30.1	2021-02-08 14:44:42 -08:00
Charvi Mendiratta	1f9696019a	sequencer: rename a few functions Rename functions to make them more descriptive and while at it, remove unnecessary 'inline' of the skip_fixupish() function. Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Charvi Mendiratta <charvi077@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-08 13:09:57 -08:00
Charvi Mendiratta	a25314c1ec	sequencer: fixup the datatype of the 'flag' argument As 'flag' is a combination of bits, so change its datatype from 'enum todo_item_flags' to 'unsigned'. Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Charvi Mendiratta <charvi077@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-08 13:09:57 -08:00
Johannes Schindelin	2cc543deab	range-diff(docs): explain how to specify commit ranges There are three forms, depending whether the user specifies one, two or three non-option arguments. We've never actually explained how this works in the manual, so let's explain it. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-06 21:24:55 -08:00
Johannes Schindelin	359f0d754a	range-diff/format-patch: handle commit ranges other than A..B In the `SPECIFYING RANGES` section of gitrevisions[7], two ways are described to specify commit ranges that `range-diff` does not yet accept: "<commit>^!" and "<commit>^-<n>". Let's accept them, by parsing them via the revision machinery and looking for at least one interesting and one uninteresting revision in the resulting `pending` array. This also finally lets us reject arguments that _do_ contain `..` but are not actually ranges, e.g. `HEAD^{/do.. match this}`. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-06 21:24:55 -08:00
Johannes Schindelin	1e79f97326	range-diff: offer --left-only/--right-only options When comparing commit ranges, one is frequently interested only in one side, such as asking the question "Has this patch that I submitted to the Git mailing list been applied?": one would only care about the part of the output that corresponds to the commits in a local branch. To make that possible, imitate the `git rev-list` options `--left-only` and `--right-only`. This addresses https://github.com/gitgitgadget/git/issues/206 Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-06 21:14:31 -08:00
Johannes Schindelin	3e6046edad	range-diff: move the diffopt initialization down one layer It is actually only the `output()` function that uses those diffopts. By moving the diffopt initialization down into that function, it is encapsulated better. Incidentally, it will also make it easier to implement the `--left-only` and `--right-only` options in `git range-diff` because the `output()` function is now receiving all range-diff options as a parameter, not just the diffopts. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-06 21:14:31 -08:00
Johannes Schindelin	f1ce6c191e	range-diff: combine all options in a single data structure This will make it easier to implement the `--left-only` and `--right-only` options. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-06 21:14:31 -08:00
Junio C Hamano	fb7fa4a1fd	Sync with maint	2021-02-05 16:41:17 -08:00
Junio C Hamano	4527ecdc8d	The sixth batch Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-05 16:40:46 -08:00
Junio C Hamano	4513f6bbb1	Merge branch 'sg/test-stress-jobs' Test framework fix. * sg/test-stress-jobs: test-lib: prevent '--stress-jobs=X' from being ignored	2021-02-05 16:40:46 -08:00
Junio C Hamano	dfc3c2b224	Merge branch 'jk/weather-balloon-require-variadic-macro' We've carried compatibility codepaths for compilers without variadic macros for quite some time, but the world may be ready for them to be removed. Force compilation failure on exotic platforms where variadic macros are not available to find out who screams in such a way that we can easily revert if it turns out that the world is not yet ready. * jk/weather-balloon-require-variadic-macro: git-compat-util: always enable variadic macros	2021-02-05 16:40:46 -08:00
Junio C Hamano	b6c90a2a22	Merge branch 'pb/ci-matrix-wo-shortcut' Our setting of GitHub CI test jobs were a bit too eager to give up once there is even one failure found. Tweak the knob to allow other jobs keep running even when we see a failure, so that we can find more failures in a single run. * pb/ci-matrix-wo-shortcut: ci: do not cancel all jobs of a matrix if one fails	2021-02-05 16:40:46 -08:00
Junio C Hamano	61b159e219	Merge branch 'pb/blame-funcname-range-userdiff' Test fix. * pb/blame-funcname-range-userdiff: annotate-tests: quote variable expansions containing path names	2021-02-05 16:40:45 -08:00
Junio C Hamano	4cc0e8794d	Merge branch 'jk/p5303-sed-portability-fix' A perf script was made more portable. * jk/p5303-sed-portability-fix: p5303: avoid sed GNU-ism	2021-02-05 16:40:45 -08:00
Junio C Hamano	77db59c2f9	Merge branch 'jv/pack-objects-narrower-ref-iteration' The "pack-objects" command needs to iterate over all the tags when automatic tag following is enabled, but it actually iterated over all refs and then discarded everything outside "refs/tags/" hierarchy, which was quite wasteful. * jv/pack-objects-narrower-ref-iteration: builtin/pack-objects.c: avoid iterating all refs	2021-02-05 16:40:45 -08:00
Junio C Hamano	f6ef8baba2	Merge branch 'ph/use-delete-refs' When removing many branches and tags, the code used to do so one ref at a time. There is another API it can use to delete multiple refs, and it makes quite a lot of performance difference when the refs are packed. * ph/use-delete-refs: use delete_refs when deleting tags or branches	2021-02-05 16:40:45 -08:00
Junio C Hamano	6254fa1359	Merge branch 'tb/ls-refs-optim' The ls-refs protocol operation has been optimized to narrow the sub-hierarchy of refs/ it walks to produce response. * tb/ls-refs-optim: ls-refs.c: traverse prefixes of disjoint "ref-prefix" sets ls-refs.c: initialize 'prefixes' before using it refs: expose 'for_each_fullref_in_prefixes'	2021-02-05 16:40:45 -08:00
Junio C Hamano	5198426d91	Merge branch 'zh/ls-files-deduplicate' "git ls-files" can and does show multiple entries when the index is unmerged, which is a source for confusion unless -s/-u option is in use. A new option --deduplicate has been introduced. * zh/ls-files-deduplicate: ls-files.c: add --deduplicate option ls_files.c: consolidate two for loops into one ls_files.c: bugfix for --deleted and --modified	2021-02-05 16:40:44 -08:00
Junio C Hamano	a0a2d75d3b	Merge branch 'ds/cache-tree-basics' Document, clean-up and optimize the code around the cache-tree extension in the index. * ds/cache-tree-basics: cache-tree: speed up consecutive path comparisons cache-tree: use ce_namelen() instead of strlen() index-format: discuss recursion of cache-tree better index-format: update preamble to cache tree extension index-format: use 'cache tree' over 'cached tree' cache-tree: trace regions for prime_cache_tree cache-tree: trace regions for I/O cache-tree: use trace2 in cache_tree_update() unpack-trees: add trace2 regions tree-walk: report recursion counts	2021-02-05 16:40:44 -08:00
Junio C Hamano	b65b9ff1ff	Merge branch 'en/ort-conflict-handling' ORT merge strategy learns more support for merge conflicts. * en/ort-conflict-handling: merge-ort: add handling for different types of files at same path merge-ort: copy find_first_merges() implementation from merge-recursive.c merge-ort: implement format_commit() merge-ort: copy and adapt merge_submodule() from merge-recursive.c merge-ort: copy and adapt merge_3way() from merge-recursive.c merge-ort: flesh out implementation of handle_content_merge() merge-ort: handle book-keeping around two- and three-way content merge merge-ort: implement unique_path() helper merge-ort: handle directory/file conflicts that remain merge-ort: handle D/F conflict where directory disappears due to merge	2021-02-05 16:40:44 -08:00
Junio C Hamano	aac006aa99	Merge branch 'so/log-diff-merge' "git log" learned a new "--diff-merges=<how>" option. * so/log-diff-merge: (32 commits) t4013: add tests for --diff-merges=first-parent doc/git-show: include --diff-merges description doc/rev-list-options: document --first-parent changes merges format doc/diff-generate-patch: mention new --diff-merges option doc/git-log: describe new --diff-merges options diff-merges: add '--diff-merges=1' as synonym for 'first-parent' diff-merges: add old mnemonic counterparts to --diff-merges diff-merges: let new options enable diff without -p diff-merges: do not imply -p for new options diff-merges: implement new values for --diff-merges diff-merges: make -m/-c/--cc explicitly mutually exclusive diff-merges: refactor opt settings into separate functions diff-merges: get rid of now empty diff_merges_init_revs() diff-merges: group diff-merge flags next to each other inside 'rev_info' diff-merges: split 'ignore_merges' field diff-merges: fix -m to properly override -c/--cc t4013: add tests for -m failing to override -c/--cc t4013: support test_expect_failure through ':failure' magic diff-merges: revise revs->diff flag handling diff-merges: handle imply -p on -c/--cc logic for log.c ...	2021-02-05 16:40:44 -08:00
Derrick Stolee	eb9071912f	commit-graph: anonymize data in chunk_write_fn In preparation for creating an API around file formats using chunks and tables of contents, prepare the commit-graph write code to use prototypes that will match this new API. Specifically, convert chunk_write_fn to take a "void *data" parameter instead of the commit-graph-specific "struct write_commit_graph_context" pointer. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-05 15:40:41 -08:00
Jonathan Tan	4f37d45706	clone: respect remote unborn HEAD Teach Git to use the "unborn" feature introduced in a previous patch as follows: Git will always send the "unborn" argument if it is supported by the server. During "git clone", if cloning an empty repository, Git will use the new information to determine the local branch to create. In all other cases, Git will ignore it. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-05 13:49:55 -08:00
Jonathan Tan	39835409d1	connect, transport: encapsulate arg in struct In a future patch we plan to return the name of an unborn current branch from deep in the callchain to a caller via a new pointer parameter that points at a variable in the caller when the caller calls get_remote_refs() and transport_get_remote_refs(). In preparation for that, encapsulate the existing ref_prefixes parameter into a struct. The aforementioned unborn current branch will go into this new struct in the future patch. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-05 13:49:54 -08:00
Jonathan Tan	59e1205d16	ls-refs: report unborn targets of symrefs When cloning, we choose the default branch based on the remote HEAD. But if there is no remote HEAD reported (which could happen if the target of the remote HEAD is unborn), we'll fall back to using our local init.defaultBranch. Traditionally this hasn't been a big deal, because most repos used "master" as the default. But these days it is likely to cause confusion if the server and client implementations choose different values (e.g., if the remote started with "main", we may choose "master" locally, create commits there, and then the user is surprised when they push to "master" and not "main"). To solve this, the remote needs to communicate the target of the HEAD symref, even if it is unborn, and "git clone" needs to use this information. Currently, symrefs that have unborn targets (such as in this case) are not communicated by the protocol. Teach Git to advertise and support the "unborn" feature in "ls-refs" (by default, this is advertised, but server administrators may turn this off through the lsrefs.unborn config). This feature indicates that "ls-refs" supports the "unborn" argument; when it is specified, "ls-refs" will send the HEAD symref with the name of its unborn target. This change is only for protocol v2. A similar change for protocol v0 would require independent protocol design (there being no analogous position to signal support for "unborn") and client-side plumbing of the data required, so the scope of this patch set is limited to protocol v2. The client side will be updated to use this in a subsequent commit. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-05 13:49:53 -08:00
Thomas Ackermann	6eda9ac9e5	doc: use https links Use only https links for lore.kernel.org. Signed-off-by: Thomas Ackermann <th.acker@arcor.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-05 11:57:10 -08:00
Thomas Ackermann	1d18997007	doc hash-function-transition: move rationale upwards Move rationale for new hash function to beginning of document so that it appears before the concrete move to SHA-256 is described. Remove some of the details about SHA-1 weaknesses and add references to the details on how the new hash function was chosen instead. Signed-off-by: Thomas Ackermann <th.acker@arcor.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-05 11:57:10 -08:00
Thomas Ackermann	cc9f0916bd	doc hash-function-transition: fix incomplete sentence Signed-off-by: Thomas Ackermann <th.acker@arcor.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-05 11:57:02 -08:00
Thomas Ackermann	810372f881	doc hash-function-transition: use upper case consistently Use upper case consistently in Document History. Signed-off-by: Thomas Ackermann <th.acker@arcor.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-05 11:57:02 -08:00
Thomas Ackermann	af9b1e9aba	doc hash-function-transition: use SHA-1 and SHA-256 consistently Use SHA-1 and SHA-256 instead of sha1 and sha256 when referring to the hash type. Signed-off-by: Thomas Ackermann <th.acker@arcor.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-05 11:57:02 -08:00
Thomas Ackermann	de82095a95	doc hash-function-transition: fix asciidoc output Asciidoc requires lists to start with an empty line and uses different characters for indentation levels ("-", "", "*", ...). For special symbols like a dash "--" has to be used and there is no double arrow "<->", so a left and right arrow "<-->" has to be combined for that. Lastly for verbatim output a newline followed by an indentation has to be used. Fix asciidoc output for lists, special characters and verbatim text while retaining the readabilty of the original text file. Signed-off-by: Thomas Ackermann <th.acker@arcor.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-05 11:57:02 -08:00
Johannes Schindelin	5189bb8724	range-diff: simplify code spawning `git log` Previously, we waited for the child process to be finished in every failing code path as well as at the end of the function `show_range_diff()`. However, we do not need to wait that long. Directly after reading the output of the child process, we can wrap up the child process. This also has the advantage that we don't do a bunch of unnecessary work in case `finish_command()` returns with an error anyway. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-04 17:16:42 -08:00
Johannes Schindelin	a2d474adf3	range-diff: libify the read_patches() function again In library functions, we do want to avoid the (simple, but rather final) `die()` calls, instead returning with a value indicating an error. Let's do exactly that in the code introduced in `b66885a30c` (range-diff: add section header instead of diff header, 2019-07-11) that wants to error out if a diff header could not be parsed. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-04 17:16:42 -08:00
Johannes Schindelin	8c29b49794	range-diff: avoid leaking memory in two error code paths In the code paths in question, we already release a lot of memory, but the `current_filename` variable was missed. Fix that. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-04 17:16:42 -08:00
Junio C Hamano	30b29f044a	The fifth batch	2021-02-03 15:04:49 -08:00
Junio C Hamano	22f2bce651	Merge branch 'jk/run-command-use-shell-doc' The .use_shell flag in struct child_process that is passed to run_command() API has been clarified with a bit more documentation. * jk/run-command-use-shell-doc: run-command: document use_shell option	2021-02-03 15:04:49 -08:00
Junio C Hamano	973e20b83f	Merge branch 'jk/peel-iterated-oid' The peel_ref() API has been replaced with peel_iterated_oid(). * jk/peel-iterated-oid: refs: switch peel_ref() to peel_iterated_oid()	2021-02-03 15:04:49 -08:00
Junio C Hamano	6cd7f9dc29	Merge branch 'js/skip-dashed-built-ins-from-config-mak' Build fix. * js/skip-dashed-built-ins-from-config-mak: SKIP_DASHED_BUILT_INS: respect `config.mak`	2021-02-03 15:04:49 -08:00
Junio C Hamano	d03553ecd1	Merge branch 'jt/packfile-as-uri-doc' Doc fix for packfile URI feature. * jt/packfile-as-uri-doc: Doc: clarify contents of packfile sent as URI	2021-02-03 15:04:49 -08:00
Junio C Hamano	15bf48b987	Merge branch 'ds/maintenance-prefetch-cleanup' Test clean-up plus UI improvement by hiding extra refs that the prefetch task uses from "log --decorate" output. * ds/maintenance-prefetch-cleanup: t7900: clean up some broken refs maintenance: set log.excludeDecoration durin prefetch	2021-02-03 15:04:48 -08:00
Junio C Hamano	18e3f5a944	Merge branch 'ab/fsck-doc-fix' Documentation for "git fsck" lost stale bits that has become incorrect. * ab/fsck-doc-fix: fsck doc: remove ancient out-of-date diagnostics	2021-02-03 15:04:48 -08:00
Pranit Bauva	97b8294474	bisect--helper: retire `--check-and-set-terms` subcommand The `--check-and-set-terms` subcommand is no longer from the git-bisect.sh shell script. Instead the function `check_and_set_terms()` is called from the C implementation. Mentored-by: Lars Schneider <larsxschneider@gmail.com> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com> Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com> Signed-off-by: Miriam Rubio <mirucam@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-03 14:52:09 -08:00
Pranit Bauva	e4c7b33747	bisect--helper: reimplement `bisect_skip` shell function in C Reimplement the `bisect_skip()` shell function in C and also add `bisect-skip` subcommand to `git bisect--helper` to call it from git-bisect.sh Using `--bisect-skip` subcommand is a temporary measure to port shell function to C so as to use the existing test suite. Mentored-by: Lars Schneider <larsxschneider@gmail.com> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com> Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com> Signed-off-by: Miriam Rubio <mirucam@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-03 14:52:09 -08:00
Pranit Bauva	9feea34810	bisect--helper: retire `--bisect-auto-next` subcommand The --bisect-auto-next subcommand is no longer used from the git-bisect.sh shell script. Instead the function bisect_auto_next() is directly called from the C implementation. Mentored-by: Lars Schneider <larsxschneider@gmail.com> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com> Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com> Signed-off-by: Miriam Rubio <mirucam@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-03 14:52:09 -08:00
Pranit Bauva	b7a6f163d6	bisect--helper: use `res` instead of return in BISECT_RESET case option Use `res` variable to store `bisect_reset()` output in BISECT_RESET case option to make bisect--helper.c more consistent. Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Miriam Rubio <mirucam@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-03 14:52:09 -08:00
Pranit Bauva	68efed8c8a	bisect--helper: retire `--bisect-write` subcommand The `--bisect-write` subcommand is no longer used from the git-bisect.sh shell script. Instead the function `bisect_write()` is directly called from the C implementation. Mentored-by: Lars Schneider <larsxschneider@gmail.com> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com> Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com> Signed-off-by: Miriam Rubio <mirucam@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-03 14:52:08 -08:00
Pranit Bauva	2b1fd947f6	bisect--helper: reimplement `bisect_replay` shell function in C Reimplement the `bisect_replay` shell function in C and also add `--bisect-replay` subcommand to `git bisect--helper` to call it from git-bisect.sh Using `--bisect-replay` subcommand is a temporary measure to port shell function to C so as to use the existing test suite. Mentored-by: Lars Schneider <larsxschneider@gmail.com> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com> Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com> Signed-off-by: Miriam Rubio <mirucam@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-03 14:52:08 -08:00
Pranit Bauva	97d5ba6a39	bisect--helper: reimplement `bisect_log` shell function in C Reimplement the `bisect_log()` shell function in C and also add `--bisect-log` subcommand to `git bisect--helper` to call it from git-bisect.sh . Using `--bisect-log` subcommand is a temporary measure to port shell function to C so as to use the existing test suite. Mentored-by: Lars Schneider <larsxschneider@gmail.com> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Helped-by: Rafael Silva <rafaeloliveira.cs@gmail.com> Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com> Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com> Signed-off-by: Miriam Rubio <mirucam@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-03 14:52:08 -08:00
Jeff King	27dc071b9a	doc/git-branch: fix awkward wording for "-c" The description for "-c" is hard to parse. I think the big issue is lack of commas, but I've also reordered the words to keep the main focus point of "instead of renaming, copy" together. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-03 14:14:31 -08:00
Jeff King	bca362c1f9	completion: handle other variants of "branch -m" We didn't special-case "branch -M" (with a capital M) the same as "branch -m", nor any of the "--copy" variants. As a result these offered any ref as the next candidate, and not just branch names. Note that I rewrapped case-arm line since it's now quite long, and likewise the one below it for consistency. I also re-ordered the existing "-D" to make it more obvious how the cases group together. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-03 14:14:24 -08:00
Torsten Bögershausen	5c327502db	MacOS: precompose_argv_prefix() The following sequence leads to a "BUG" assertion running under MacOS: DIR=git-test-restore-p Adiarnfd=$(printf 'A\314\210') DIRNAME=xx${Adiarnfd}yy mkdir $DIR && cd $DIR && git init && mkdir $DIRNAME && cd $DIRNAME && echo "Initial" >file && git add file && echo "One more line" >>file && echo y \| git restore -p . Initialized empty Git repository in /tmp/git-test-restore-p/.git/ BUG: pathspec.c:495: error initializing pathspec_item Cannot close git diff-index --cached --numstat [snip] The command `git restore` is run from a directory inside a Git repo. Git needs to split the $CWD into 2 parts: The path to the repo and "the rest", if any. "The rest" becomes a "prefix" later used inside the pathspec code. As an example, "/path/to/repo/dir-inside-repå" would determine "/path/to/repo" as the root of the repo, the place where the configuration file .git/config is found. The rest becomes the prefix ("dir-inside-repå"), from where the pathspec machinery expands the ".", more about this later. If there is a decomposed form, (making the decomposing visible like this), "dir-inside-rep°a" doesn't match "dir-inside-repå". Git commands need to: (a) read the configuration variable "core.precomposeunicode" (b) precocompose argv[] (c) precompose the prefix, if there was any The first commit, `76759c7dff` "git on Mac OS and precomposed unicode" addressed (a) and (b). The call to precompose_argv() was added into parse-options.c, because that seemed to be a good place when the patch was written. Commands that don't use parse-options need to do (a) and (b) themselfs. The commands `diff-files`, `diff-index`, `diff-tree` and `diff` learned (a) and (b) in commit `90a78b83e0` "diff: run arguments through precompose_argv" Branch names (or refs in general) using decomposed code points resulting in decomposed file names had been fixed in commit `8e712ef6fc` "Honor core.precomposeUnicode in more places" The bug report from above shows 2 things: - more commands need to handle precomposed unicode - (c) should be implemented for all commands using pathspecs Solution: precompose_argv() now handles the prefix (if needed), and is renamed into precompose_argv_prefix(). Inside this function the config variable core.precomposeunicode is read into the global variable precomposed_unicode, as before. This reading is skipped if precomposed_unicode had been read before. The original patch for preocomposed unicode, `76759c7dff`, placed precompose_argv() into parse-options.c Now add it into git.c::run_builtin() as well. Existing precompose calls in diff-files.c and others may become redundant, and if we audit the callflows that reach these places to make sure that they can never be reached without going through the new call added to run_builtin(), we might be able to remove these existing ones. But in this commit, we do not bother to do so and leave these precompose callsites as they are. Because precompose() is idempotent and can be called on an already precomposed string safely, this is safer than removing existing calls without fully vetting the callflows. There is certainly room for cleanups - this change intends to be a bug fix. Cleanups needs more tests in e.g. t/t3910-mac-os-precompose.sh, and should be done in future commits. [1] git-bugreport-2021-01-06-1209.txt (git can't deal with special characters) [2] https://lore.kernel.org/git/A102844A-9501-4A86-854D-E3B387D378AA@icloud.com/ Reported-by: Daniel Troger <random_n0body@icloud.com> Helped-By: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-03 14:09:37 -08:00
Jeff King	a534cf4f4d	completion: treat "branch -D" the same way as "branch -d" The former offers not just branches but tags as completion candidates. Mimic how "branch -d" limits its suggestion to branch names. Reported-by: Paul Jolly <paul@myitcv.io> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-02 13:26:10 -08:00
Jacob Vosmaer	ad6b5fefbd	t5544: clarify 'hook works with partial clone' test Apply a few leftover improvements from the review of `ad5df6b782` (upload-pack.c: fix filter spec quoting bug). 1. Instead of enumerating objects reachable from HEAD, enumerate all reachable objects, because HEAD has not special significance in this test. 2. Instead of relying on the knowledge that "? in rev-list output means partial clone", explicitly verify that there are no blobs with cat-file. Signed-off-by: Jacob Vosmaer <jacob@gitlab.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-02 12:21:38 -08:00
Pratyush Yadav	7da7ef6d7a	Merge branch 'mk/russian-translation' Fix typo in Russian translation. * mk/russian-translation: git-gui: fix typo in russian locale	2021-02-02 23:51:30 +05:30
Mikhail Klyushin	413e96f41e	git-gui: fix typo in russian locale Fixed typo in russian locale: издекса -> индекса Signed-off-by: Mikhail Klyushin <klyushinmisha@gmail.com> Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>	2021-02-02 23:50:31 +05:30
Ævar Arnfjörð Bjarmason	be8fc53e36	pager: properly log pager exit code when signalled When git invokes a pager that exits with non-zero the common case is that we'll already return the correct SIGPIPE failure from git itself, but the exit code logged in trace2 has always been incorrectly reported[1]. Fix that and log the correct exit code in the logs. Since this gives us something to test outside of our recently-added tests needing a !MINGW prerequisite, let's refactor the test to run on MINGW and actually check for SIGPIPE outside of MINGW. The wait_or_whine() is only called with a true "in_signal" from from finish_command_in_signal(), which in turn is only used in pager.c. The "in_signal && !WIFEXITED(status)" case is not covered by tests. Let's log the default -1 in that case for good measure. 1. The incorrect logging of the exit code in was seemingly copy/pasted into finish_command_in_signal() in `ee4512ed48` (trace2: create new combined trace facility, 2019-02-22) Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-01 21:15:58 -08:00
Ævar Arnfjörð Bjarmason	85db79a96e	run-command: add braces for "if" block in wait_or_whine() Add braces to an "if" block in the wait_or_whine() function. This isn't needed now, but will make a subsequent commit easier to read. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-01 21:15:58 -08:00
Ævar Arnfjörð Bjarmason	c24b7f6736	pager: test for exit code with and without SIGPIPE Add tests for how git behaves when the pager itself exits with non-zero, as well as for us exiting with 141 when we're killed with SIGPIPE due to the pager not consuming its output. There is some recent discussion[1] about these semantics, but aside from what we want to do in the future, we should have a test for the current behavior. This test construct is stolen from `7559a1be8a` (unblock and unignore SIGPIPE, 2014-09-18). The reason not to make the test itself depend on the MINGW prerequisite is to make a subsequent commit easier to read. 1. https://lore.kernel.org/git/87o8h4omqa.fsf@evledraar.gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-01 21:15:58 -08:00
Ævar Arnfjörð Bjarmason	61ff12fa50	pager: refactor wait_for_pager() function Refactor the wait_for_pager() function. Since `507d7804c0` (pager: don't use unsafe functions in signal handlers, 2015-09-04) the wait_for_pager() and wait_for_pager_atexit() callers diverged on more than they shared. Let's extract the common code into a new close_pager_fds() helper, and move the parts unique to the only to callers to those functions. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-01 21:15:58 -08:00
Derrick Stolee	bc50d6c91f	commit-graph: prepare commit graph Before checking if the repository has a commit-graph loaded, be sure to run prepare_commit_graph(). This is necessary because otherwise the topo_levels slab is not initialized. As we compute topo_levels for the new commits, we iterate further into the lower layers since the first visit to each commit looks as though the topo_level is not populated. By properly initializing the topo_slab, we fix the previously broken case of a split commit graph where a base layer has the generation_data_overflow chunk. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-01 21:03:36 -08:00
Derrick Stolee	fde55b0906	commit-graph: be extra careful about mixed generations When upgrading to a commit-graph with corrected commit dates from one without, there are a few things that need to be considered. When computing generation numbers for the new commit-graph file that expects to add the generation_data chunk with corrected commit dates, we need to ensure that the 'generation' member of the commit_graph_data struct is set to zero for these commits. Unfortunately, the fallback to use topological level for generation number when corrected commit dates are not available are causing us harm here: parsing commits notices that read_generation_data is false and populates 'generation' with the topological level. The solution is to iterate through the commits, parse the commits to populate initial values, then reset the generation values to zero to trigger recalculation. This loop only occurs when the existing commit-graph data has no corrected commit dates. While this improves our situation somewhat, we have not completely solved the issue for correctly computing generation numbers for mixed layers. That follows in the next change. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-01 21:03:36 -08:00
Derrick Stolee	9c2c0a8256	commit-graph: compute generations separately The compute_generation_numbers() method was introduced by `3258c663` (commit-graph: compute generation numbers, 2018-05-01) to compute what is now known as "topological levels". These are still stored in the commit-graph file for compatibility sake while `c1a09119` (commit-graph: implement corrected commit date, 2021-01-16) updated the method to also compute the new version of generation numbers: corrected commit date. It makes sense why these are grouped. They perform very similar walks of the necessary commits and compute similar maximums over each parent. However, having these two together conflates them in subtle ways that is hard to separate. In particular, the topo_level slab is used to store the topological levels in all cases, but the commit_graph_data_at(c)->generation member stores different values depending on the state of the existing commit-graph file. * If the existing commit-graph file has a "GDAT" chunk, then these values represent corrected commit dates. * If the existing commit-graph file doesn't have a "GDAT" chunk, then these values are actually the topological levels. This issue only occurs only when upgrading an existing commit-graph file into one that has the "GDAT" chunk. The current change does not resolve this upgrade problem, but splitting the implementation into two pieces here helps with that process, which will follow in the next change. The important thing this helps with is the case where the num_generation_data_overflows was being incremented incorrectly, triggering a write of the overflow chunk. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-01 21:03:36 -08:00
Derrick Stolee	448a39e65d	commit-graph: validate layers for generation data We need to be extra careful that we don't use corrected commit dates from any layer of a commit-graph chain if there is a single commit-graph file that is missing the generation_data chunk. Update validate_mixed_generation_chain() to correctly update each layer to ignore the generation_data chunk in this case. It now also returns 1 if all layers have a generation_data chunk. This return value will be used in the next change. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-01 21:03:36 -08:00
Derrick Stolee	90cb1c47c7	commit-graph: always parse before commit_graph_data_at() There is a subtle failure happening when computing corrected commit dates with --split enabled. It requires a base layer needing the generation_data_overflow chunk. Then, the next layer on top erroneously thinks it needs an overflow chunk due to a bug leading to recalculating all reachable generation numbers. The output of the failure is BUG: commit-graph.c:1912: expected to write 8 bytes to chunk 47444f56, but wrote 0 instead These "expected" 8 bytes are due to re-computing the corrected commit date for the lower layer but the new layer does not need any overflow. Add a test to t5318-commit-graph.sh that demonstrates this bug. However, it does not trigger consistently with the existing code. The generation number data is stored in a slab and accessed by commit_graph_data_at(). This data is initialized when parsing a commit, but is otherwise used assuming it has been populated. The loop in compute_generation_numbers() did not enforce that all reachable commits were parsed and had correct values. This could lead to some problems when writing a commit-graph with corrected commit dates based on a commit-graph without them. It has been difficult to identify the issue here because it was so hard to reproduce. It relies on this uninitialized data having a non-zero value, but also on specifically in a way that overwrites the existing data. This patch adds the extra parse to ensure the data is filled before we compute the generation number of a commit. This triggers the new test to fail because the generation number overflow count does not match between this computation and the write for that chunk. The actual fix will follow as the next few changes. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-01 21:03:36 -08:00
Derrick Stolee	c4cc083169	commit-graph: use repo_parse_commit The write_commit_graph_context has a repository pointer, so use it. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-01 21:03:35 -08:00
Derrick Stolee	0fac156523	commit-reach: reduce requirements for remove_redundant() Remove a comment at the beggining of remove_redundant() that mentions a reordering of the input array to have the initial segment be the independent commits and the final segment be the redundant commits. While this behavior is followed in remove_redundant(), no callers rely on that behavior. Remove the final loop that copies this final segment and update the comment to match the new behavior. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-01 11:50:33 -08:00
Rafael Silva	076b444a62	worktree: teach `list` verbose mode "git worktree list" annotates each worktree according to its state such as "prunable" or "locked", however it is not immediately obvious why these worktrees are being annotated. For prunable worktrees a reason is available that is returned by should_prune_worktree() and for locked worktrees a reason might be available provided by the user via `lock` command. Let's teach "git worktree list" a --verbose mode that outputs the reason why the worktrees are being annotated. The reason is a text that can take virtually any size and appending the text on the default columned format will make it difficult to extend the command with other annotations and not fit nicely on the screen. In order to address this shortcoming the annotation is then moved to the next line indented followed by the reason If the reason is not available the annotation stays on the same line as the worktree itself. The output of "git worktree list" with verbose becomes like so: $ git worktree list --verbose ... /path/to/locked-no-reason acb124 [branch-a] locked /path/to/locked-with-reason acc125 [branch-b] locked: worktree with a locked reason /path/to/prunable-reason ace127 [branch-d] prunable: gitdir file points to non-existent location ... Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Rafael Silva <rafaeloliveira.cs@gmail.com> Reviewed-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-30 09:57:40 -08:00
Rafael Silva	9b19a58f66	worktree: teach `list` to annotate prunable worktree The "git worktree list" command shows the absolute path to the worktree, the commit that is checked out, the name of the branch, and a "locked" annotation if the worktree is locked, however, it does not indicate whether the worktree is prunable. The "prune" command will remove a worktree if it is prunable unless `--dry-run` option is specified. This could lead to a worktree being removed without the user realizing before it is too late, in case the user forgets to pass --dry-run for instance. If the "list" command shows which worktree is prunable, the user could verify before running "git worktree prune" and hopefully prevents the working tree to be removed "accidentally" on the worse case scenario. Let's teach "git worktree list" to show when a worktree is a prunable candidate for both default and porcelain format. In the default format a "prunable" text is appended: $ git worktree list /path/to/main aba123 [main] /path/to/linked 123abc [branch-a] /path/to/prunable ace127 (detached HEAD) prunable In the --porcelain format a prunable label is added followed by its reason: $ git worktree list --porcelain ... worktree /path/to/prunable HEAD abc1234abc1234abc1234abc1234abc1234abc12 detached prunable gitdir file points to non-existent location ... Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Rafael Silva <rafaeloliveira.cs@gmail.com> Reviewed-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-30 09:57:35 -08:00
Rafael Silva	862c723d18	worktree: teach `list --porcelain` to annotate locked worktree Commit `c57b3367be` (worktree: teach `list` to annotate locked worktree, 2020-10-11) taught "git worktree list" to annotate locked worktrees by appending "locked" text to its output, however, this is not listed in the --porcelain format. Teach "list --porcelain" to do the same and add a "locked" attribute followed by its reason, thus making both default and porcelain format consistent. If the locked reason is not available then only "locked" is shown. The output of the "git worktree list --porcelain" becomes like so: $ git worktree list --porcelain ... worktree /path/to/locked HEAD 123abcdea123abcd123acbd123acbda123abcd12 detached locked worktree /path/to/locked-with-reason HEAD abc123abc123abc123abc123abc123abc123abc1 detached locked reason why it is locked ... In porcelain mode, if the lock reason contains special characters such as newlines, they are escaped with backslashes and the entire reason is enclosed in double quotes. For example: $ git worktree list --porcelain ... locked "worktree's path mounted in\nremovable device" ... Furthermore, let's update the documentation to state that some attributes in the porcelain format might be listed alone or together with its value depending whether the value is available or not. Thus documenting the case of the new "locked" attribute. Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Rafael Silva <rafaeloliveira.cs@gmail.com> Reviewed-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-30 09:57:29 -08:00
Rafael Silva	47409e75f5	t2402: ensure locked worktree is properly cleaned up `c57b3367be` (worktree: teach `list` to annotate locked worktree, 2020-10-11) introduced a new test to ensure locked worktrees are listed with "locked" annotation. However, the test does not clean up after itself as "git worktree prune" is not going to remove the locked worktree in the first place. This not only leaves the test in an unclean state it also potentially breaks following tests that rely on the "git worktree list" output. Let's fix that by unlocking the worktree before the "prune" command. Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Rafael Silva <rafaeloliveira.cs@gmail.com> Reviewed-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-30 09:57:24 -08:00
Rafael Silva	eb36135af7	worktree: teach worktree_lock_reason() to gently handle main worktree worktree_lock_reason() aborts with an assertion failure when called on the main worktree since locking the main worktree is nonsensical. Not only is this behavior undocumented, thus callers might not even be aware that the call could potentially crash the program, but it also forces clients to be extra careful: if (!is_main_worktree(wt) && worktree_locked_reason(...)) ... Since we know that locking makes no sense in the context of the main worktree, we can simply return false for the main worktree, thus making client code less complex by eliminating the need for the callers to have inside knowledge about the implementation: if (worktree_lock_reason(...)) ... Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Rafael Silva <rafaeloliveira.cs@gmail.com> Reviewed-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-30 09:57:20 -08:00
Rafael Silva	fc0c7d5e9e	worktree: teach worktree to lazy-load "prunable" reason Add worktree_prune_reason() to allow a caller to discover whether a worktree is prunable and the reason that it is, much like worktree_lock_reason() indicates whether a worktree is locked and the reason for the lock. As with worktree_lock_reason(), retrieve the prunable reason lazily and cache it in the `worktree` structure. Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Rafael Silva <rafaeloliveira.cs@gmail.com> Reviewed-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-30 09:57:16 -08:00
Rafael Silva	a29a8b7574	worktree: libify should_prune_worktree() As part of teaching "git worktree list" to annotate worktree that is a candidate for pruning, let's move should_prune_worktree() from builtin/worktree.c to worktree.c in order to make part of the worktree public API. should_prune_worktree() knows how to select the given worktree for pruning based on an expiration date, however the expiration value is stored in a static file-scope variable and it is not local to the function. In order to move the function, teach should_prune_worktree() to take the expiration date as an argument and document the new parameter that is not immediately obvious. Also, change the function comment to clearly state that the worktree's path is returned in `wtpath` argument. Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Rafael Silva <rafaeloliveira.cs@gmail.com> Reviewed-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-30 09:57:08 -08:00
Charvi Mendiratta	2c0aa2ce2e	doc/git-rebase: add documentation for fixup [-C\|-c] options Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk> Reviewed-by: Marc Branchaud <marcnarc@xiplink.com> Signed-off-by: Charvi Mendiratta <charvi077@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-29 15:21:56 -08:00
Charvi Mendiratta	bae5b4aea5	rebase -i: teach --autosquash to work with amend! If the commit subject starts with "amend!" then rearrange it like a "fixup!" commit and replace `pick` command with `fixup -C` command, which is used to fixup up the content if any and replaces the original commit message with amend! commit's message. Original-patch-by: Phillip Wood <phillip.wood@dunelm.org.uk> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Charvi Mendiratta <charvi077@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-29 15:21:56 -08:00
Charvi Mendiratta	1d410cd8c2	t3437: test script for fixup [-C\|-c] options in interactive rebase Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Charvi Mendiratta <charvi077@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-29 15:21:56 -08:00
Charvi Mendiratta	9e3cebd97c	rebase -i: add fixup [-C \| -c] command Add options to `fixup` command to fixup both the commit contents and message. `fixup -C` command is used to replace the original commit message and `fixup -c`, additionally allows to edit the commit message. Original-patch-by: Phillip Wood <phillip.wood@dunelm.org.uk> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Charvi Mendiratta <charvi077@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-29 15:21:56 -08:00
Charvi Mendiratta	71ee81cd9e	sequencer: use const variable for commit message comments This makes it easier to use and reuse the comments. Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Charvi Mendiratta <charvi077@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-29 15:21:56 -08:00
Charvi Mendiratta	ae70e34f23	sequencer: pass todo_item to do_pick_commit() As an additional member of the structure todo_item will be required in future commits pass the complete structure. Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Charvi Mendiratta <charvi077@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-29 15:21:56 -08:00
Phillip Wood	7cdb968254	rebase -i: comment out squash!/fixup! subjects from squash message When squashing commit messages the squash!/fixup! subjects are not of interest so comment them out to stop them becoming part of the final message. This change breaks a bunch of --autosquash tests which rely on the "squash! <subject>" line appearing in the final commit message. This is addressed by adding a second line to the commit message of the "squash! ..." commits and testing for that. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Charvi Mendiratta <charvi077@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-29 15:21:56 -08:00
Dimitriy Ryazantcev	a093f0ba95	l10n: ru.po: update Russian translation Kudos to Philipp Bartsch for whitespace fixes and his helper script[1]. [1]: https://git.grmr.de/phil/pocheck Signed-off-by: Dimitriy Ryazantcev <dimitriy.ryazantcev@gmail.com>	2021-01-29 21:45:17 +02:00
Taylor Blau	6885cd7dc5	t5325: check both on-disk and in-memory reverse index Right now, the test suite can be run with 'GIT_TEST_WRITE_REV_INDEX=1' in the environment, which causes all operations which write a pack to also write a .rev file. To prepare for when that eventually becomes the default, we should continue to test the in-memory reverse index, too, in order to avoid losing existing coverage. Unfortunately, explicit existing coverage is rather sparse, so only a basic test is added that compares the result of git rev-list --objects --no-object-names --all \| git cat-file --batch-check='%(objectsize:disk) %(objectname)' with and without an on-disk reverse index. Suggested-by: Jeff King <peff@peff.net> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-28 22:51:51 -08:00
Jeff King	018b9deba5	pretty: lazy-load commit data when expanding user-format When we expand a user-format, we try to avoid work that isn't necessary for the output. For instance, we don't bother parsing the commit header until we know we need the author, subject, etc. But we do always load the commit object's contents from disk, even if the format doesn't require it (e.g., just "%H"). Traditionally this didn't matter much, because we'd have loaded it as part of the traversal anyway, and we'd typically have those bytes attached to the commit struct (or these days, cached in a commit-slab). But when we have a commit-graph, we might easily get to the point of pretty-printing a commit without ever having looked at the actual object contents. We should push off that load (and reencoding) until we're certain that it's needed. I think the results of p4205 show the advantage pretty clearly (we serve parent and tree oids out of the commit struct itself, so they benefit as well): # using git.git as the test repo Test HEAD^ HEAD ---------------------------------------------------------------------- 4205.1: log with %H 0.40(0.39+0.01) 0.03(0.02+0.01) -92.5% 4205.2: log with %h 0.45(0.44+0.01) 0.09(0.09+0.00) -80.0% 4205.3: log with %T 0.40(0.39+0.00) 0.04(0.04+0.00) -90.0% 4205.4: log with %t 0.46(0.46+0.00) 0.09(0.08+0.01) -80.4% 4205.5: log with %P 0.39(0.39+0.00) 0.03(0.03+0.00) -92.3% 4205.6: log with %p 0.46(0.46+0.00) 0.10(0.09+0.00) -78.3% 4205.7: log with %h-%h-%h 0.52(0.51+0.01) 0.15(0.14+0.00) -71.2% 4205.8: log with %an-%ae-%s 0.42(0.41+0.00) 0.42(0.41+0.01) +0.0% # using linux.git as the test repo Test HEAD^ HEAD ---------------------------------------------------------------------- 4205.1: log with %H 7.12(6.97+0.14) 0.76(0.65+0.11) -89.3% 4205.2: log with %h 7.35(7.19+0.16) 1.30(1.19+0.11) -82.3% 4205.3: log with %T 7.58(7.42+0.15) 1.02(0.94+0.08) -86.5% 4205.4: log with %t 8.05(7.89+0.15) 1.55(1.41+0.13) -80.7% 4205.5: log with %P 7.12(7.01+0.10) 0.76(0.69+0.07) -89.3% 4205.6: log with %p 7.38(7.27+0.10) 1.32(1.20+0.12) -82.1% 4205.7: log with %h-%h-%h 7.81(7.67+0.13) 1.79(1.67+0.12) -77.1% 4205.8: log with %an-%ae-%s 7.90(7.74+0.15) 7.81(7.66+0.15) -1.1% I added the final test to show where we don't improve (the 1% there is just lucky noise), but also as a regression test to make sure we're not doing anything stupid like loading the commit multiple times when there are several placeholders that need it. Reported-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-28 14:07:35 -08:00
Johannes Schindelin	f7d42ceec5	rebase -i: do leave commit message intact in fixup! chains In `6e98de72c0` (sequencer (rebase -i): add support for the 'fixup' and 'squash' commands, 2017-01-02), this developer introduced a change of behavior by mistake: when encountering a `fixup!` commit (or multiple `fixup!` commits) without any `squash!` commit thrown in, the final `git commit` was invoked with `--cleanup=strip`. Prior to that commit, the commit command had been called without that `--cleanup` option. Since we explicitly read the original commit message from a file in that case, there is really no sense in forcing that clean-up. We actually need to actively suppress that clean-up lest a configured `commit.cleanup` may interfere with what we want to do: leave the commit message unchanged. Reported-by: Vojtěch Knyttl <vojtech@knyt.tl> Helped-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-28 12:12:37 -08:00
Jeff King	30291525d9	t0000: consistently use single quotes for outer tests When we use the sub-test helpers, we end up defining one shell snippet inside another shell snippet. So if we use single-quotes for the outer snippet, we have to use double-quotes within the inner snippet (it's included as here-doc within the outer snippet, but using a single quote would end the outer snippet early). Or vice versa we can use double quotes for the outer snippet, but then single quotes in the inner. We have some of each in the script, and neither is wrong. But it would be nice to be consistent unless there is a good reason not to. Using single quotes for the outer script is preferable, because it requires less metacharacter quoting overall. For example, in: test_expect_success 'outer' ' run_sub_test_lib_test ... <<-\EOF echo $foo && test_expect_success "inner" " echo \$bar " EOF ' we need only quote inside "inner", but not inside "outer" or the here-doc. Whereas if we flip them, we have to quote in both places: test_expect_success 'outer' " run_sub_test_lib_test ... <<-\EOF echo \$foo && test_expect_success 'inner' ' echo \$bar ' EOF " The exception is when we need a literal single-quote in an expected output here-doc. There we can either use outer double-quotes, or just use ${SQ} within the doc. I chose the latter for consistency (within this test, but also with other test scripts that face the same problem). There is one other interesting case, which is some tests that do: test_expect_success ... " do_something --run='"'!3'"' " This is rather confusing to read, but is correct. The outer script sees '!3' in single-quotes, as does the eval'd snippet. This is perhaps being overly cautious. In many interactive shells, an exclamation triggers history expansion even inside double quotes, but that is not generally true in non-interactive shells. There's some conflicting information here. Commit `784ce03d55` (t4216: avoid unnecessary subshell in test_bloom_filters_not_used, 2020-05-19) reports it as a problem with OpenBSD 6.7's /bin/sh. However, we have many instances in this script of prereqs like !LAZY_TRUE, which haven't been a problem. I left them un-escaped here to test out this theory. It's much nicer if we can not worry about this as a portability issue, so it's worth knowing. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-28 12:06:26 -08:00
Jeff King	080e295248	t0000: run cleaning test inside sub-test Our check of test_when_finished is done directly in the main script, and if we failed to clean, we complain and exit immediately. It's nicer to signal a test failure here, for a few reasons: - this gives better output to the user when run under a TAP harness like "prove" - constency; it's the only test left in the file that behaves this way - half of its "if" conditional is nonsense anyway; it picked up a reference to GIT_TEST_FAIL_PREREQS_INTERNAL in `dfe1a17df9` (tests: add a special setup where prerequisites fail, 2019-05-13) along with its neighbors, even though it has nothing to do with that flag We could actually do this without a sub-test at all, and just put our two tests (one to do cleanup, and one to check that it happened) in the main script. But doing it in a subtest is conceptually cleaner (from the perspective of the main test script, we are checking only one thing), and it remains consistent with the "cleanup when failing" test directly after it, which has to happen in a sub-test (to avoid the main script complaining of the failed test). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-28 12:06:26 -08:00
Jeff King	efd2600e6f	t0000: run prereq tests inside sub-test We test the behavior of prerequisites in t0000 by setting up fake ones in the main test script, trying to run some tests, and then seeing if those tests impacted the environment correctly. If they didn't, then we write a message and manually call exit. Instead, let's push these down into a sub-test, like many of the other tests covering the framework itself. This has a few advantages: - it does not pollute the test output with mention of skipped tests (that we know are uninteresting -- the point of the test was to see that these are skipped). - when running in a TAP harness, we get a useful test failure message (whereas when the script exits early, a tool like "prove" simply says "Dubious, test returned 1"). - we do not have to worry about different test environments, such as when GIT_TEST_FAIL_PREREQS_INTERNAL is set. Our sub-test helpers already give us a known environment. - the tests themselves are a bit easier to read, as we can just check the test-framework output to see what happened (and get the usual test_cmp diff if it failed) A few notes on the implementation: - we could do one sub-test per each individual test_expect_success. I broke it up here into a few logical groups, as I think this makes it more readable - the original tests modified environment variables inside the test bodies. Instead, I've used "true" as the body of a test we expect to run and "false" otherwise. Technically this does not confirm that the body of the "true" test actually ran. We are trusting the framework output to believe that it truly ran, which is sufficient for these tests. And I think the end result is much simpler to follow. - the nested_prereq test uses a few bare "test -f" calls; I converted these to our usual test_path_is_* helpers while moving the code around. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-28 12:06:26 -08:00
Jeff King	03efadb774	t0000: keep clean-up tests together We check that test_when_finished cleans up after a test, and that it runs even after a failure. Those two were originally adjacent, but got split apart by the new test added in `477dcaddb6` (tests: do not let lazy prereqs inside `test_expect_*` turn off tracing, 2020-03-26), and then further by more lazy-prereq tests. Let's move them back together. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-28 12:06:25 -08:00
Jeff King	8380dcd700	oid_pos(): access table through const pointers When we are looking up an oid in an array, we obviously don't need to write to the array. Let's mark it as const in the function interfaces, as well as in the local variables we use to derference the void pointer (note a few cases use pointers-to-pointers, so we mark everything const). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-28 12:03:26 -08:00
Jeff King	45ee13b942	hash_pos(): convert to oid_pos() All of our callers are actually looking up an object_id, not a bare hash. Likewise, the arrays they are looking in are actual arrays of object_id (not just raw bytes of hashes, as we might find in a pack .idx; those are handled by bsearch_hash()). Using an object_id gives us more type safety, and makes the callers slightly shorter. It also gets rid of the word "sha1" from several access functions, though we could obviously also rename those with s/sha1/hash/. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-28 12:02:39 -08:00
Jeff King	680ff910b0	rerere: use strmap to store rerere directories We store a struct for each directory we access under .git/rr-cache. The structs are kept in an array sorted by the binary hash associated with their name (and we do lookups with a binary search). This works OK, but there are a few small downsides: - the amount of code isn't huge, but it's more than we'd need using one of our other stock data structures - the insertion into a sorted array is quadratic (though in practice it's unlikely anybody has enough conflicts for this to matter) - it's intimately tied to the representation of an object hash. This isn't a big deal, as the conflict ids we generate use the same hash, but it produces a few awkward bits (e.g., we are the only user of hash_pos() that is not using object_id). Let's instead just treat the directory names as strings, and store them in a strmap. This is less code, and removes the use of hash_pos(). Insertion is now non-quadratic, though we probably use a bit more memory. Besides the hash table overhead, and storing hex bytes instead of a binary hash, we actually store each name twice. Other code expects to access the name of a rerere_dir struct from the struct itself, so we need a copy there. But strmap keeps its own copy of the name, as well. Using a bare hashmap instead of strmap means we could use the name for both, but at the cost of extra code (e.g., our own comparison function). Likewise, strmap has a feature to use a pointer to the in-struct name at the cost of a little extra code. I didn't do either here, as simple code seemed more important than squeezing out a few bytes of efficiency. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-28 11:26:20 -08:00
Jeff King	098c173f2b	rerere: tighten rr-cache dirname check We check only that get_sha1_hex() doesn't complain, which means we'd match an all-hex name with trailing cruft after it. This probably doesn't matter much in practice, since there shouldn't be anything else in the rr-cache directory, but it could possibly cause us to mix up sha1 and sha256 entries (which also shouldn't be intermingled, but could be leftovers from a repository conversion). Note that "get_sha1_hex()" is a confusing historical name. It is actually using the_hash_algo, so it would be sha256 in a sha256 repo. We'll switch to using parse_oid_hex(), because that conveniently advances our pointer. But it also gets rid of the sha1 name. Arguably it's a little funny to use "object_id" here for something that isn't actually naming an object, but it's unlikely to be a problem (and is contained in a single function). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-28 11:25:43 -08:00
Jeff King	2bc1a87e42	rerere: check dirname format while iterating rr_cache directory In rerere_gc(), we walk over the .git/rr_cache directory and create a struct for each entry we find. We feed any name we get from readdir() to find_rerere_dir(), which then calls get_sha1_hex() on it (since we use the binary hash as a lookup key). If that fails (i.e., the directory name is not what we expected), it returns NULL. But the comment in find_rerere_dir() says "BUG". It _would_ be a bug for the call from new_rerere_id_hex(), the only other code path, to fail here; it's generating the hex internally. But the call in rerere_gc() is using it say "is this a plausible directory name". Let's instead have rerere_gc() do its own "is this plausible" check. That has two benefits: - we can now reliably BUG() inside find_rerere_dir(), which would catch bugs in the other code path (and we now will never return NULL from the function, which makes it easier to see that a rerere_id struct will always have a non-NULL "collection" field). - it makes the use of the binary hash an implementation detail of find_rerere_dir(), not known by callers. That will free us up to change it in a future patch. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-28 11:21:27 -08:00
Jeff King	98c431b6f9	commit_graft_pos(): take an oid instead of a bare hash All of our callers have an object_id, and are just dereferencing the hash field to pass to us. Let's take the actual object_id instead. We still access the hash to pass to hash_pos, but it's a step in the right direction. This makes the callers slightly simpler, but also gets rid of the untyped pointer, as well as the now-inaccurate name "sha1". Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-28 11:21:07 -08:00
Jacob Vosmaer	ad5df6b782	upload-pack.c: fix filter spec quoting bug Fix a bug in upload-pack.c that occurs when you combine partial clone and uploadpack.packObjectsHook. You can reproduce it as follows: git clone -u 'git -c uploadpack.allowfilter '\ '-c uploadpack.packobjectshook=env '\ 'upload-pack' --filter=blob:none --no-local \ src.git dst.git Be careful with the line endings because this has a long quoted string as the -u argument. The error I get when I run this is: Cloning into '/tmp/broken'... remote: fatal: invalid filter-spec ''blob:none'' error: git upload-pack: git-pack-objects died with error. fatal: git upload-pack: aborting due to possible repository corruption on the remote side. remote: aborting due to possible repository corruption on the remote side. fatal: early EOF fatal: index-pack failed The problem is caused by unneeded quoting. This bug was already present in `10ac85c785` (upload-pack: add object filtering for partial clone, 2017-12-08) when the server side filter support was introduced. In fact, in `10ac85c785` this was broken regardless of uploadpack.packObjectsHook. Then in `0b6069fe0a` (fetch-pack: test support excluding large blobs, 2017-12-08) the quoting was removed but only behind a conditional that depends on whether uploadpack.packObjectsHook is set. Because uploadpack.packObjectsHook is apparently rarely used, nobody noticed the problematic quoting could still happen. Remove the conditional quoting and add a test for partial clone in t5544-pack-objects-hook. Signed-off-by: Jacob Vosmaer <jacob@gitlab.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-28 09:40:24 -08:00
Jeff King	765dc16888	git-compat-util: always enable variadic macros We allow variadic macros in the code base, but only if there is fallback code for platforms that lack it. This leads to some annoyances: - the code is more complicated because of the fallbacks (e.g., trace_printf(), etc, is implemented twice with a set of parallel wrappers). - some constructs are just impossible and we've had to live without them (e.g., a cross between FLEX_ALLOC and xstrfmt) Since this feature is present in C99, we may be able to start counting on it being available everywhere. Let's start with a weather balloon patch to find out. This patch makes the absolute minimal change by always setting HAVE_VARIADIC_MACROS. If somebody runs into a platform where it's a problem, they can undo it by commenting out the define. Likewise, if we have to revert this, it would be quite unlikely to cause conflicts. Once we feel comfortable that this is the right direction, then we can start ripping out all the spots that actually look at the flag, and removing the dead code. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-27 22:14:37 -08:00
Johannes Schindelin	679b5916cd	range-diff/format-patch: refactor check for commit range Currently, when called with exactly two arguments, `git range-diff` tests for a literal `..` in each of the two. Likewise, the argument provided via `--range-diff` to `git format-patch` is checked in the same manner. However, `<commit>^!` is a perfectly valid commit range, equivalent to `<commit>^..<commit>` according to the `SPECIFYING RANGES` section of gitrevisions[7]. In preparation for allowing more sophisticated ways to specify commit ranges, let's refactor the check into its own function. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-27 22:01:49 -08:00
SZEDER Gábor	134768cf53	test-lib: prevent '--stress-jobs=X' from being ignored './t1234-foo.sh --stress-jobs=X ...' is supposed to run that test script in X parallel jobs, but the number of jobs specified on the command line is entirely ignored if other '--stress'-related options follow. I.e. both './t1234-foo.sh --stress-jobs=X --stress-limit=Y' and './t1234-foo.sh --stress-jobs=X --stress' fall back to using twice the number of CPUs parallel jobs instead. The former has been broken since commit `de69e6f6c9` (tests: let --stress-limit=<N> imply --stress, 2019-03-03) [1], which started to unconditionally overwrite the $stress variable holding the specified number of jobs in its effort to imply '--stress'. The latter has been broken since `f545737144` (tests: introduce --stress-jobs=<N>, 2019-03-03), because it didn't consider that handling '--stress' will overwrite that variable as well. We could fix this by being more careful about (over)writing that $stress variable and checking first whether it has already been set. But I think it's cleaner to use a dedicated variable to hold the number of specified parallel jobs, so let's do that instead. [1] In `de69e6f6c9` there was no '--stress-jobs=X' option yet, the number of parallel jobs had to be specified via '--stress=X', so, strictly speaking, `de69e6f6c9` broke './t1234-foo.sh --stress=X --stress-limit=Y'. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-26 17:58:33 -08:00
Ævar Arnfjörð Bjarmason	15c9649730	grep/log: remove hidden --debug and --grep-debug options Remove the hidden "grep --debug" and "log --grep-debug" options added in `17bf35a3c7` (grep: teach --debug option to dump the parse tree, 2012-09-13). At the time these options seem to have been intended to go along with a documentation discussion and to help the author of relevant tests to perform ad-hoc debugging on them[1]. Reasons to want this gone: 1. They were never documented, and the only (rather trivial) use of them in our own codebase for testing is something I removed back in `e01b4dab01` (grep: change non-ASCII -i test to stop using --debug, 2017-05-20). 2. Googling around doesn't show any in-the-wild uses I could dig up, and on the Git ML the only mentions after the original discussion seem to have been when they came up in unrelated diff contexts, or that test commit of mine. 3. An exception to that is `c581e4a749` (grep: under --debug, show whether PCRE JIT is enabled, 2019-08-18) where we added the ability to dump out when PCREv2 has the JIT in effect. The combination of that and my earlier `b65abcafc7` (grep: use PCRE v2 for optimized fixed-string search, 2019-07-01) means Git prints this out in its most common in-the-wild configuration: $ git log --grep-debug --grep=foo --grep=bar --grep=baz --all-match pcre2_jit_on=1 pcre2_jit_on=1 pcre2_jit_on=1 [all-match] (or pattern_body<body>foo (or pattern_body<body>bar pattern_body<body>baz ) ) $ git grep --debug $ -e foo --and -e bar $ --or -e baz pcre2_jit_on=1 pcre2_jit_on=1 pcre2_jit_on=1 (or (and patternfoo patternbar ) patternbaz ) I.e. for each pattern we're considering for the and/or/--all-match etc. debugging we'll now diligently spew out another identical line saying whether the PCREv2 JIT is on or not. I think that nobody's complained about that rather glaringly obviously bad output says something about how much this is used, i.e. it's not. The need for this debugging aid for the composed grep/log patterns seems to have passed, and the desire to dump the JIT config seems to have been another one-off around the time we had JIT-related issues on the PCREv2 codepath. That the original author of this debugging facility seemingly hasn't noticed the bad output since then[2] is probably some indicator. 1. https://lore.kernel.org/git/cover.1347615361.git.git@drmicha.warpmail.net/ 2. https://lore.kernel.org/git/xmqqk1b8x0ac.fsf@gitster-ct.c.googlers.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-26 11:36:20 -08:00
Taylor Blau	ec8e7760ac	pack-revindex: ensure that on-disk reverse indexes are given precedence When an on-disk reverse index exists, there is no need to generate one in memory. In fact, doing so can be slow, and require large amounts of the heap. Let's make sure that we treat the on-disk reverse index with precedence (i.e., that when it exists, we don't bother trying to generate an equivalent one in memory) by teaching Git how to conditionally die() when generating a reverse index in memory. Then, add a test to ensure that when (a) an on-disk reverse index exists, and (b) when setting GIT_TEST_REV_INDEX_DIE_IN_MEMORY, that we do not die, implying that we read from the on-disk one. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-25 18:32:44 -08:00
Taylor Blau	e8c58f894b	t: support GIT_TEST_WRITE_REV_INDEX Add a new option that unconditionally enables the pack.writeReverseIndex setting in order to run the whole test suite in a mode that generates on-disk reverse indexes. Additionally, enable this mode in the second run of tests under linux-gcc in 'ci/run-build-and-tests.sh'. Once on-disk reverse indexes are proven out over several releases, we can change the default value of that configuration to 'true', and drop this patch. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-25 18:32:44 -08:00
Taylor Blau	35a8a3547a	t: prepare for GIT_TEST_WRITE_REV_INDEX In the next patch, we'll add support for unconditionally enabling the 'pack.writeReverseIndex' setting with a new GIT_TEST_WRITE_REV_INDEX environment variable. This causes a little bit of fallout with tests that, for example, compare the list of files in the pack directory being unprepared to see .rev files in its output. Those locations can be cleaned up to look for specific file extensions, rather than take everything in the pack directory (for instance) and then grep out unwanted items. Once the pack.writeReverseIndex option has been thoroughly tested, we will default it to 'true', removing GIT_TEST_WRITE_REV_INDEX, and making it possible to revert this patch. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-25 18:32:44 -08:00
Taylor Blau	1615c567b8	Documentation/config/pack.txt: advertise 'pack.writeReverseIndex' Now that the pack.writeReverseIndex configuration is respected in both 'git index-pack' and 'git pack-objects' (and therefore, all of their callers), we can safely advertise it for use in the git-config manual. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-25 18:32:44 -08:00
Taylor Blau	c97733435a	builtin/pack-objects.c: respect 'pack.writeReverseIndex' Now that we have an implementation that can write the new reverse index format, enable writing a .rev file in 'git pack-objects' by consulting the pack.writeReverseIndex configuration variable. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-25 18:32:43 -08:00
Taylor Blau	e37d0b8730	builtin/index-pack.c: write reverse indexes Teach 'git index-pack' to optionally write and verify reverse index with '--[no-]rev-index', as well as respecting the 'pack.writeReverseIndex' configuration option. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-25 18:32:43 -08:00
Taylor Blau	84d544943c	builtin/index-pack.c: allow stripping arbitrary extensions To derive the filename for a .idx file, 'git index-pack' uses derive_filename() to strip the '.pack' suffix and add the new suffix. Prepare for stripping off suffixes other than '.pack' by making the suffix to strip a parameter of derive_filename(). In order to make this consistent with the "suffix" parameter which does not begin with a ".", an additional check in derive_filename. Suggested-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-25 18:32:43 -08:00
Taylor Blau	8ef50d9958	pack-write.c: prepare to write 'pack-*.rev' files This patch prepares for callers to be able to write reverse index files to disk. It adds the necessary machinery to write a format-compliant .rev file from within 'write_rev_file()', which is called from 'finish_tmp_packfile()'. Similar to the process by which the reverse index is computed in memory, these new paths also have to sort a list of objects by their offsets within a packfile. These new paths use a qsort() (as opposed to a radix sort), since our specialized radix sort requires a full revindex_entry struct per object, which is more memory than we need to allocate. The qsort is obviously slower, but the theoretical slowdown would require a repository with a large amount of objects, likely implying that the time spent in, say, pack-objects during a repack would dominate the overall runtime. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-25 18:32:43 -08:00
Taylor Blau	2f4ba2a867	packfile: prepare for the existence of '.rev' files Specify the format of the on-disk reverse index 'pack-.rev' file, as well as prepare the code for the existence of such files. The reverse index maps from pack relative positions (i.e., an index into the array of object which is sorted by their offsets within the packfile) to their position within the 'pack-.idx' file. Today, this is done by building up a list of (off_t, uint32_t) tuples for each object (the off_t corresponding to that object's offset, and the uint32_t corresponding to its position in the index). To convert between pack and index position quickly, this array of tuples is radix sorted based on its offset. This has two major drawbacks: First, the in-memory cost scales linearly with the number of objects in a pack. Each 'struct revindex_entry' is sizeof(off_t) + sizeof(uint32_t) + padding bytes for a total of 16. To observe this, force Git to load the reverse index by, for e.g., running 'git cat-file --batch-check="%(objectsize:disk)"'. When asking for a single object in a fresh clone of the kernel, Git needs to allocate 120+ MB of memory in order to hold the reverse index in memory. Second, the cost to sort also scales with the size of the pack. Luckily, this is a linear function since 'load_pack_revindex()' uses a radix sort, but this cost still must be paid once per pack per process. As an example, it takes ~60x longer to print the _size_ of an object as it does to print that entire object's _contents_: Benchmark #1: git.compile cat-file --batch <obj Time (mean ± σ): 3.4 ms ± 0.1 ms [User: 3.3 ms, System: 2.1 ms] Range (min … max): 3.2 ms … 3.7 ms 726 runs Benchmark #2: git.compile cat-file --batch-check="%(objectsize:disk)" <obj Time (mean ± σ): 210.3 ms ± 8.9 ms [User: 188.2 ms, System: 23.2 ms] Range (min … max): 193.7 ms … 224.4 ms 13 runs Instead, avoid computing and sorting the revindex once per process by writing it to a file when the pack itself is generated. The format is relatively straightforward. It contains an array of uint32_t's, the length of which is equal to the number of objects in the pack. The ith entry in this table contains the index position of the ith object in the pack, where "ith object in the pack" is determined by pack offset. One thing that the on-disk format does _not_ contain is the full (up to) eight-byte offset corresponding to each object. This is something that the in-memory revindex contains (it stores an off_t in 'struct revindex_entry' along with the same uint32_t that the on-disk format has). Omit it in the on-disk format, since knowing the index position for some object is sufficient to get a constant-time lookup in the pack-.idx file to ask for an object's offset within the pack. This trades off between the on-disk size of the 'pack-.rev' file for runtime to chase down the offset for some object. Even though the lookup is constant time, the constant is heavier, since it can potentially involve two pointer walks in v2 indexes (one to access the 4-byte offset table, and potentially a second to access the double wide offset table). Consider trying to map an object's pack offset to a relative position within that pack. In a cold-cache scenario, more page faults occur while switching between binary searching through the reverse index and searching through the .idx file for an object's offset. Sure enough, with a cold cache (writing '3' into '/proc/sys/vm/drop_caches' after 'sync'ing), printing out the entire object's contents is still marginally faster than printing its size: Benchmark #1: git.compile cat-file --batch-check="%(objectsize:disk)" <obj >/dev/null Time (mean ± σ): 22.6 ms ± 0.5 ms [User: 2.4 ms, System: 7.9 ms] Range (min … max): 21.4 ms … 23.5 ms 41 runs Benchmark #2: git.compile cat-file --batch <obj >/dev/null Time (mean ± σ): 17.2 ms ± 0.7 ms [User: 2.8 ms, System: 5.5 ms] Range (min … max): 15.6 ms … 18.2 ms 45 runs (Numbers taken in the kernel after cheating and using the next patch to generate a reverse index). There are a couple of approaches to improve cold cache performance not pursued here: - We could include the object offsets in the reverse index format. Predictably, this does result in fewer page faults, but it triples the size of the file, while simultaneously duplicating a ton of data already available in the .idx file. (This was the original way I implemented the format, and it did show `--batch-check='%(objectsize:disk)'` winning out against `--batch`.) On the other hand, this increase in size also results in a large block-cache footprint, which could potentially hurt other workloads. - We could store the mapping from pack to index position in more cache-friendly way, like constructing a binary search tree from the table and writing the values in breadth-first order. This would result in much better locality, but the price you pay is trading O(1) lookup in 'pack_pos_to_index()' for an O(log n) one (since you can no longer directly index the table). So, neither of these approaches are taken here. (Thankfully, the format is versioned, so we are free to pursue these in the future.) But, cold cache performance likely isn't interesting outside of one-off cases like asking for the size of an object directly. In real-world usage, Git is often performing many operations in the revindex (i.e., asking about many objects rather than a single one). The trade-off is worth it, since we will avoid the vast majority of the cost of generating the revindex that the extra pointer chase will look like noise in the following patch's benchmarks. This patch describes the format and prepares callers (like in pack-revindex.c) to be able to read *.rev files once they exist. An implementation of the writer will appear in the next patch, and callers will gradually begin to start using the writer in the patches that follow after that. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-25 18:32:43 -08:00
Junio C Hamano	e6362826a0	The fourth batch Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-25 14:19:20 -08:00
Junio C Hamano	b7bb322cba	Merge branch 'ab/mailmap-fixup' Follow-up fixes and improvements to ab/mailmap topic. * ab/mailmap-fixup: t4203: make blame output massaging more robust mailmap doc: use correct environment variable 'GIT_WORK_TREE' t4203: stop losing return codes of git commands test-lib-functions.sh: fix usage for test_commit()	2021-01-25 14:19:20 -08:00
Junio C Hamano	bcaaf972e6	Merge branch 'tb/pack-revindex-api' Abstract accesses to in-core revindex that allows enumerating objects stored in a packfile in the order they appear in the pack, in preparation for introducing an on-disk precomputed revindex. * tb/pack-revindex-api: (21 commits) for_each_object_in_pack(): clarify pack vs index ordering pack-revindex.c: avoid direct revindex access in 'offset_to_pack_pos()' pack-revindex: hide the definition of 'revindex_entry' pack-revindex: remove unused 'find_revindex_position()' pack-revindex: remove unused 'find_pack_revindex()' builtin/gc.c: guess the size of the revindex for_each_object_in_pack(): convert to new revindex API unpack_entry(): convert to new revindex API packed_object_info(): convert to new revindex API retry_bad_packed_offset(): convert to new revindex API get_delta_base_oid(): convert to new revindex API rebuild_existing_bitmaps(): convert to new revindex API try_partial_reuse(): convert to new revindex API get_size_by_pos(): convert to new revindex API show_objects_for_type(): convert to new revindex API bitmap_position_packfile(): convert to new revindex API check_object(): convert to new revindex API write_reused_pack_verbatim(): convert to new revindex API write_reused_pack_one(): convert to new revindex API write_reuse_object(): convert to new revindex API ...	2021-01-25 14:19:20 -08:00
Junio C Hamano	381dac2349	Merge branch 'ab/coc-update-to-2.0' Update the Code-of-conduct to version 2.0 from the upstream (we've been using version 1.4). * ab/coc-update-to-2.0: CoC: update to version 2.0 + local changes CoC: explicitly take any whitespace breakage CoC: Update word-wrapping to match upstream	2021-01-25 14:19:19 -08:00
Junio C Hamano	294e949fa2	Merge branch 'ps/config-env-pairs' Introduce two new ways to feed configuration variable-value pairs via environment variables, and tweak the way GIT_CONFIG_PARAMETERS encodes variable/value pairs to make it more robust. * ps/config-env-pairs: config: allow specifying config entries via envvar pairs environment: make `getenv_safe()` a public function config: store "git -c" variables using more robust format config: parse more robust format in GIT_CONFIG_PARAMETERS config: extract function to parse config pairs quote: make sq_dequote_step() a public function config: add new way to pass config via `--config-env` git: add `--super-prefix` to usage string	2021-01-25 14:19:19 -08:00
Junio C Hamano	7eefa1349b	Merge branch 'cc/write-promisor-file' A bit of code refactoring. * cc/write-promisor-file: pack-write: die on error in write_promisor_file() fetch-pack: refactor writing promisor file fetch-pack: rename helper to create_promisor_file()	2021-01-25 14:19:19 -08:00
Junio C Hamano	8b48981987	Merge branch 'jx/bundle' "git bundle" learns "--stdin" option to read its refs from the standard input. Also, it now does not lose refs whey they point at the same object. * jx/bundle: bundle: arguments can be read from stdin bundle: lost objects when removing duplicate pendings test: add helper functions for git-bundle	2021-01-25 14:19:19 -08:00
Junio C Hamano	42342b3ee6	Merge branch 'ab/mailmap' Clean-up docs, codepaths and tests around mailmap. * ab/mailmap: (22 commits) shortlog: remove unused(?) "repo-abbrev" feature mailmap doc + tests: document and test for case-insensitivity mailmap tests: add tests for empty "<>" syntax mailmap tests: add tests for whitespace syntax mailmap tests: add a test for comment syntax mailmap doc + tests: add better examples & test them tests: refactor a few tests to use "test_commit --append" test-lib functions: add an --append option to test_commit test-lib functions: add --author support to test_commit test-lib functions: document arguments to test_commit test-lib functions: expand "test_commit" comment template mailmap: test for silent exiting on missing file/blob mailmap tests: get rid of overly complex blame fuzzing mailmap tests: add a test for "not a blob" error mailmap tests: remove redundant entry in test mailmap tests: improve --stdin tests mailmap tests: modernize syntax & test idioms mailmap tests: use our preferred whitespace syntax mailmap doc: start by mentioning the comment syntax check-mailmap doc: note config options ...	2021-01-25 14:19:19 -08:00
Junio C Hamano	60ecad090d	Merge branch 'ps/fetch-atomic' "git fetch" learns to treat ref updates atomically in all-or-none fashion, just like "git push" does, with the new "--atomic" option. * ps/fetch-atomic: fetch: implement support for atomic reference updates fetch: allow passing a transaction to `s_update_ref()` fetch: refactor `s_update_ref` to use common exit path fetch: use strbuf to format FETCH_HEAD updates fetch: extract writing to FETCH_HEAD	2021-01-25 14:19:19 -08:00
Junio C Hamano	b69bed22c5	Merge branch 'jk/log-cherry-pick-duplicate-patches' When more than one commit with the same patch ID appears on one side, "git log --cherry-pick A...B" did not exclude them all when a commit with the same patch ID appears on the other side. Now it does. * jk/log-cherry-pick-duplicate-patches: patch-ids: handle duplicate hashmap entries	2021-01-25 14:19:19 -08:00
Junio C Hamano	27d7c8599b	Merge branch 'js/default-branch-name-tests-final-stretch' Prepare tests not to be affected by the name of the default branch "git init" creates. * js/default-branch-name-tests-final-stretch: (28 commits) tests: drop prereq `PREPARE_FOR_MAIN_BRANCH` where no longer needed t99: adjust the references to the default branch name "main" tests(git-p4): transition to the default branch name `main` t9[5-7]: adjust the references to the default branch name "main" t9[0-4]: adjust the references to the default branch name "main" t8: adjust the references to the default branch name "main" t7[5-9]: adjust the references to the default branch name "main" t7[0-4]: adjust the references to the default branch name "main" t6[4-9]: adjust the references to the default branch name "main" t64: preemptively adjust alignment to prepare for `master` -> `main` t6[0-3]: adjust the references to the default branch name "main" t5[6-9]: adjust the references to the default branch name "main" t55[4-9]: adjust the references to the default branch name "main" t55[23]: adjust the references to the default branch name "main" t551: adjust the references to the default branch name "main" t550: adjust the references to the default branch name "main" t5503: prepare aligned comment for replacing `master` with `main` t5[0-4]: adjust the references to the default branch name "main" t5323: prepare centered comment for `master` -> `main` t4: adjust the references to the default branch name "main" ...	2021-01-25 14:19:18 -08:00
Junio C Hamano	440acfbe0c	Merge branch 'dl/reflog-with-single-entry' After expiring a reflog and making a single commit, the reflog for the branch would record a single entry that knows both @{0} and @{1}, but we failed to answer "what commit were we on?", i.e. @{1} * dl/reflog-with-single-entry: refs: allow @{n} to work with n-sized reflog refs: factor out set_read_ref_cutoffs()	2021-01-25 14:19:18 -08:00
Junio C Hamano	0806279428	Merge branch 'sj/untracked-files-in-submodule-directory-is-not-dirty' "git diff" showed a submodule working tree with untracked cruft as "Submodule commit <objectname>-dirty", but a natural expectation is that the "-dirty" indicator would align with "git describe --dirty", which does not consider having untracked files in the working tree as source of dirtiness. The inconsistency has been fixed. * sj/untracked-files-in-submodule-directory-is-not-dirty: diff: do not show submodule with untracked files as "-dirty"	2021-01-25 14:19:18 -08:00
Junio C Hamano	dfcd905069	Merge branch 'jc/deprecate-pack-redundant' Warn loudly when the "pack-redundant" command, which has been left stale with almost unusable performance issues, gets used, as we no longer want to recommend its use (instead just "repack -d" instead). * jc/deprecate-pack-redundant: pack-redundant: gauge the usage before proposing its removal	2021-01-25 14:19:18 -08:00
Junio C Hamano	c7b1aaf6d6	Merge branch 'jk/forbid-lf-in-git-url' Newline characters in the host and path part of git:// URL are now forbidden. * jk/forbid-lf-in-git-url: fsck: reject .gitmodules git:// urls with newlines git_connect_git(): forbid newlines in host and path	2021-01-25 14:19:17 -08:00
Junio C Hamano	9e409d7e07	Merge branch 'ab/branch-sort' The implementation of "git branch --sort" wrt the detached HEAD display has always been hacky, which has been cleaned up. * ab/branch-sort: branch: show "HEAD detached" first under reverse sort branch: sort detached HEAD based on a flag ref-filter: move ref_sorting flags to a bitfield ref-filter: move "cmp_fn" assignment into "else if" arm ref-filter: add braces to if/else if/else chain branch tests: add to --sort tests branch: change "--local" to "--list" in comment	2021-01-25 14:19:17 -08:00
Junio C Hamano	a5ac31b5b1	Merge branch 'en/diffcore-rename' File-level rename detection updates. * en/diffcore-rename: diffcore-rename: remove unnecessary duplicate entry checks diffcore-rename: accelerate rename_dst setup diffcore-rename: simplify and accelerate register_rename_src() t4058: explore duplicate tree entry handling in a bit more detail t4058: add more tests and documentation for duplicate tree entry handling diffcore-rename: reduce jumpiness in progress counters diffcore-rename: simplify limit check diffcore-rename: avoid usage of global in too_many_rename_candidates() diffcore-rename: rename num_create to num_destinations	2021-01-25 14:19:17 -08:00
Junio C Hamano	58e2ce9112	Merge branch 'ma/more-opaque-lock-file' Code clean-up. * ma/more-opaque-lock-file: read-cache: try not to peek into `struct {lock_,temp}file` refs/files-backend: don't peek into `struct lock_file` midx: don't peek into `struct lock_file` commit-graph: don't peek into `struct lock_file` builtin/gc: don't peek into `struct lock_file`	2021-01-25 14:19:17 -08:00
Junio C Hamano	2856089e36	Merge branch 'en/merge-ort-3' Rename detection is added to the "ORT" merge strategy. * en/merge-ort-3: merge-ort: add implementation of type-changed rename handling merge-ort: add implementation of normal rename handling merge-ort: add implementation of rename collisions merge-ort: add implementation of rename/delete conflicts merge-ort: add implementation of both sides renaming differently merge-ort: add implementation of both sides renaming identically merge-ort: add basic outline for process_renames() merge-ort: implement compare_pairs() and collect_renames() merge-ort: implement detect_regular_renames() merge-ort: add initial outline for basic rename detection merge-ort: add basic data structures for handling renames	2021-01-25 14:19:17 -08:00
Junio C Hamano	c7d6d419b0	Merge branch 'ab/mktag' "git mktag" validates its input using its own rules before writing a tag object---it has been updated to share the logic with "git fsck". * ab/mktag: (23 commits) mktag: add a --[no-]strict option mktag: mark strings for translation mktag: convert to parse-options mktag: allow omitting the header/body \n separator mktag: allow turning off fsck.extraHeaderEntry fsck: make fsck_config() re-usable mktag: use fsck instead of custom verify_tag() mktag: use puts(str) instead of printf("%s\n", str) mktag: remove redundant braces in one-line body "if" mktag: use default strbuf_read() hint mktag tests: test verify_object() with replaced objects mktag tests: improve verify_object() test coverage mktag tests: test "hash-object" compatibility mktag tests: stress test whitespace handling mktag tests: run "fsck" after creating "mytag" mktag tests: don't create "mytag" twice mktag tests: don't redirect stderr to a file needlessly mktag tests: remove needless SHA-1 hardcoding mktag tests: use "test_commit" helper mktag tests: don't needlessly use a subshell ...	2021-01-25 14:19:17 -08:00
Ævar Arnfjörð Bjarmason	95ca1f987e	grep/pcre2: better support invalid UTF-8 haystacks Improve the support for invalid UTF-8 haystacks given a non-ASCII needle when using the PCREv2 backend. This is a more complete fix for a bug I started to fix in `870eea8166` (grep: do not enter PCRE2_UTF mode on fixed matching, 2019-07-26), now that PCREv2 has the PCRE2_MATCH_INVALID_UTF mode we can make use of it. This fixes the sort of case described in `8a5999838e` (grep: stess test PCRE v2 on invalid UTF-8 data, 2019-07-26), i.e.: - The subject string is non-ASCII (e.g. "ævar") - We're under a is_utf8_locale(), e.g. "en_US.UTF-8", not "C" - We are using --ignore-case, or we're a non-fixed pattern If those conditions were satisfied and we matched found non-valid UTF-8 data PCREv2 might bark on it, in practice this only happened under the JIT backend (turned on by default on most platforms). Ultimately this fixes a "regression" in `b65abcafc7` ("grep: use PCRE v2 for optimized fixed-string search", 2019-07-01), I'm putting that in scare-quotes because before then we wouldn't properly support these complex case-folding, locale etc. cases either, it just broke in different ways. There was a bug related to this the PCRE2_NO_START_OPTIMIZE flag fixed in PCREv2 10.36. It can be worked around by setting the PCRE2_NO_START_OPTIMIZE flag. Let's do that in those cases, and add tests for the bug. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-24 16:09:17 -08:00
Ævar Arnfjörð Bjarmason	a4fea08b6e	grep/pcre2 tests: don't rely on invalid UTF-8 data test As noted in [1] when I originally added this test in [2] the test was completely broken as it lacked a redirect[3]. I now think this whole thing is overly fragile. Let's only test if we have a segfault here. Before this the first test's "test_cmp" was pretty meaningless. We were only testing if PCREv2 was so broken that it would spew out something completely unrelated on stdout, which isn't very plausible. In the second test we're relying on PCREv2 forever holding to the current behavior of the PCRE_UTF8 flag, as opposed to learning some optimistic graceful fallback to PCRE2_MATCH_INVALID_UTF in the future. If that happens having this test broken under bisecting would suck. A follow-up commit will actually test this case in a meaningful way under the PCRE2_MATCH_INVALID_UTF flag. Let's run this one unconditionally, and just make sure we don't segfault. 1. `e714b898c6` (t7812: expect failure for grep -i with invalid UTF-8 data, 2019-11-29) 2. `8a5999838e` (grep: stess test PCRE v2 on invalid UTF-8 data, 2019-07-26) 3. `c74b3cbb83` (t7812: add missing redirects, 2019-11-26) Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-24 16:09:15 -08:00
Elijah Newren	557ac0350d	merge-ort: begin performance work; instrument with trace2_region_* calls Add some timing instrumentation for both merge-ort and diffcore-rename; I used these to measure and optimize performance in both, and several future patch series will build on these to reduce the timings of some select testcases. === Setup === The primary testcase I used involved rebasing a random topic in the linux kernel (consisting of 35 patches) against an older version. I added two variants, one where I rename a toplevel directory, and another where I only rebase one patch instead of the whole topic. The setup is as follows: $ git clone git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git $ git branch hwmon-updates fd8bdb23b91876ac1e624337bb88dc1dcc21d67e $ git branch hwmon-just-one fd8bdb23b91876ac1e624337bb88dc1dcc21d67e~34 $ git branch base 4703d9119972bf586d2cca76ec6438f819ffa30e $ git switch -c 5.4-renames v5.4 $ git mv drivers pilots # Introduce over 26,000 renames $ git commit -m "Rename drivers/ to pilots/" $ git config merge.renameLimit 30000 $ git config merge.directoryRenames true === Testcases === Now with REBASE standing for either "git rebase [--merge]" (using merge-recursive) or "test-tool fast-rebase" (using merge-ort), the testcases are: Testcase #1: no-renames $ git checkout v5.4^0 $ REBASE --onto HEAD base hwmon-updates Note: technically the name is misleading; there are some renames, but very few. Rename detection only takes about half the overall time. Testcase #2: mega-renames $ git checkout 5.4-renames^0 $ REBASE --onto HEAD base hwmon-updates Testcase #3: just-one-mega $ git checkout 5.4-renames^0 $ REBASE --onto HEAD base hwmon-just-one === Timing results === Overall timings, using hyperfine (1 warmup run, 3 runs for mega-renames, 10 runs for the other two cases): merge-recursive merge-ort no-renames: 18.912 s ± 0.174 s 14.263 s ± 0.053 s mega-renames: 5964.031 s ± 10.459 s 5504.231 s ± 5.150 s just-one-mega: 149.583 s ± 0.751 s 158.534 s ± 0.498 s A single re-run of each with some breakdowns: --- no-renames --- merge-recursive merge-ort overall runtime: 19.302 s 14.257 s inexact rename detection: 7.603 s 7.906 s everything else: 11.699 s 6.351 s --- mega-renames --- merge-recursive merge-ort overall runtime: 5950.195 s 5499.672 s inexact rename detection: 5746.309 s 5487.120 s everything else: 203.886 s 17.552 s --- just-one-mega --- merge-recursive merge-ort overall runtime: 151.001 s 158.582 s inexact rename detection: 143.448 s 157.835 s everything else: 7.553 s 0.747 s === Timing observations === 0) Maximum speedup The "everything else" row represents the maximum speedup we could achieve if we were to somehow infinitely parallelize inexact rename detection, but leave everything else alone. The fact that this is so much smaller than the real runtime (even in the case with virtually no renames) makes it clear just how overwhelmingly large the time spent on rename detection can be. 1) no-renames 1a) merge-ort is faster than merge-recursive, which is nice. However, this still should not be considered good enough. Although the "merge" backend to rebase (merge-recursive) is sometimes faster than the "apply" backend, this is one of those cases where it is not. In fact, even merge-ort is slower. The "apply" backend can complete this testcase in 6.940 s ± 0.485 s which is about 2x faster than merge-ort and 3x faster than merge-recursive. One goal of the merge-ort performance work will be to make it faster than git-am on this (and similar) testcases. 2) mega-renames 2a) Obviously rename detection is a huge cost; it's where most the time is spent. We need to cut that down. If we could somehow infinitely parallelize it and drive its time to 0, the merge-recursive time would drop to about 204s, and the merge-ort time would drop to about 17s. I think this particular stat shows I've subtly baked a couple performance improvements into merge-ort and into fast-rebase already. 3) just-one-mega 3a) not much to say here, it just gives some flavor for how rebasing only one patch compares to rebasing 35. === Goals === This patch is obviously just the beginning. Here are some of my goals that this measurement will help us achieve: * Drive the cost of rename detection down considerably for merges * After the above has been achieved, see if there are other slowness factors (which would have previously been overshadowed by rename detection costs) which we can then focus on and also optimize. * Ensure our rebase testcase that requires little rename detection is noticeably faster with merge-ort than with apply-based rebase. Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Taylor Blau <ttaylorr@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-23 23:30:06 -08:00
Elijah Newren	5ced7c3da0	merge-ort: ignore the directory rename split conflict for now get_provisional_directory_renames() has code to detect directories being evenly split between different locations. However, as noted previously, if there are no new files added to that directory that was split evenly, our inability to determine where the directory was renamed to doesn't matter since there are no new files to try to move into the new location. Unfortunately, that code is unaware of whether there are new files under the directory in question and we just ignore that, causing us to fail t6423 test 2b but pass test 2a; turn off the error for now, swapping which tests pass and fail. The motivating reason for switching this off as a temporary measure is that as we add optimizations, we'll start looking at only subsets of renames, and subsets of renames can start switching the result we get when this error is (wrongly) on. Once we get enough optimizations, however, we can prevent that code from even running when there are no new files added to the relevant directory, at which point we can revert this commit and then both testcases 2a and 2b will pass simultaneously. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-23 23:30:06 -08:00
Elijah Newren	cf8937acde	merge-ort: fix massive leak When a series of merges was performed (such as for a rebase or series of cherry-picks), only the data structures allocated by the final merge operation were being freed. The problem was that while picking out pieces of merge-ort to upstream, I previously misread a certain section of merge_start() and assumed it was associated with a later optimization. Include that section now, which ensures that if there was a previous merge operation, that we clear out result->priv and then re-use it for opt->priv, and otherwise we allocate opt->priv. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-23 23:30:06 -08:00
Ævar Arnfjörð Bjarmason	7599730b7e	Remove support for v1 of the PCRE library Remove support for using version 1 of the PCRE library. Its use has been discouraged by upstream for a long time, and it's in a bugfix-only state. Anyone who was relying on v1 in particular got a nudge to move to v2 in `e6c531b808` (Makefile: make USE_LIBPCRE=YesPlease mean v2, not v1, 2018-03-11), which was first released as part of v2.18.0. With this the LIBPCRE2 test prerequisites is redundant to PCRE. But I'm keeping it for self-documentation purposes, and to avoid conflict with other in-flight PCRE patches. I'm also not changing all of our own "pcre2" names to "pcre", i.e. the inverse of `6d4b5747f0` (grep: change internal pcre variable & function names to be pcre1, 2017-05-25). I don't see the point, and it makes the history/blame harder to read. Maybe if there's ever a PCRE v3... Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-23 21:15:43 -08:00
Ævar Arnfjörð Bjarmason	0205bb13d0	config.mak.uname: remove redundant NO_LIBPCRE1_JIT flag Remove a flag added in my `fb95e2e38d` (grep: un-break building with PCRE >= 8.32 without --enable-jit, 2017-06-01). It's set just below USE_LIBPCRE=YesPlease, so it's been redundant since `e6c531b808` (Makefile: make USE_LIBPCRE=YesPlease mean v2, not v1, 2018-03-11). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-23 21:15:12 -08:00
Derrick Stolee	19a0acc83e	t1092: test interesting sparse-checkout scenarios These also document some behaviors that differ from a full checkout, and possibly in a way that is not intended. The test is designed to be run with "--run=1,X" where 'X' is an interesting test case. Each test uses 'init_repos' to reset the full and sparse copies of the initial-repo that is created by the first test case. This also makes it possible to have test cases leave the working directory or index in unusual states without disturbing later cases. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-23 17:14:20 -08:00
Derrick Stolee	3b14436364	test-lib: test_region looks for trace2 regions From ff15d509b89edd4830d85d53cea3079a6b0c1c08 Mon Sep 17 00:00:00 2001 From: Derrick Stolee <dstolee@microsoft.com> Date: Mon, 11 Jan 2021 08:53:09 -0500 Subject: [PATCH 8/9] test-lib: test_region looks for trace2 regions Most test cases can verify Git's behavior using input/output expectations or changes to the .git directory. However, sometimes we want to check that Git did or did not run a certain section of code. This is particularly important for performance-only features that we want to ensure have been enabled in certain cases. Add a new 'test_region' function that checks if a trace2 region was entered and left in a given trace2 event log. There is one existing test (t0500-progress-display.sh) that performs this check already, so use the helper function instead. Note that this changes the expectations slightly. The old test (incorrectly) used two patterns for the 'grep' invocation, but this performs an OR of the patterns, not an AND. This means that as long as one region_enter event was logged, the test would succeed, even if it was not due to the progress category. More uses will be added in a later change. t6423-merge-rename-directories.sh also greps for region_enter lines, but it verifies the number of such lines, which is not the same as an existence check. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-23 17:14:18 -08:00
Derrick Stolee	dd23022acb	sparse-checkout: load sparse-checkout patterns A future feature will want to load the sparse-checkout patterns into a pattern_list, but the current mechanism to do so is a bit complicated. This is made difficult due to needing to find the sparse-checkout file in different ways throughout the codebase. The logic implemented in the new get_sparse_checkout_patterns() was duplicated in populate_from_existing_patterns() in unpack-trees.c. Use the new method instead, keeping the logic around handling the struct unpack_trees_options. The callers to get_sparse_checkout_filename() in builtin/sparse-checkout.c manipulate the sparse-checkout file directly, so it is not appropriate to replace logic in that file with get_sparse_checkout_patterns(). Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-23 17:14:07 -08:00
Derrick Stolee	6a9372f4ef	name-hash: use trace2 regions for init The lazy_init_name_hash() populates a hashset with all filenames and another with all directories represented in the index. This is run only if we need to use the hashsets to check for existence or case-folding renames. Place trace2 regions where there is already a performance trace. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-23 17:14:07 -08:00
Derrick Stolee	1fd9ae517c	repository: add repo reference to index_state It will be helpful to add behavior to index operations that might trigger an object lookup. Since each index belongs to a specific repository, add a 'repo' pointer to struct index_state that allows access to this repository. Add a BUG() statement if the repo already has an index, and the index already has a repo, but somehow the index points to a different repo. This will prevent future changes from needing to pass an additional 'struct repository repo' parameter and instead rely only on the 'struct index_state istate' parameter. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-23 17:14:07 -08:00
Derrick Stolee	cae70acf24	fsmonitor: de-duplicate BUG()s around dirty bits The index has an fsmonitor_dirty bitmap that records which index entries are "dirty" based on the response from the FSMonitor. If this bitmap ever grows larger than the index, then there was an error in how it was constructed, and it was probably a developer's bug. There are several BUG() statements that are very similar, so replace these uses with a simpler assert_index_minimum(). Since there is one caller that uses a custom 'pos' value instead of the bit_size member, we cannot simplify it too much. However, the error string is identical in each, so this simplifies things. Be sure to add one when checking if a position if valid, since the minimum is a bound on the expected size. The end result is that the code is simpler to read while also preserving these assertions for developers in the FSMonitor space. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-23 17:14:07 -08:00
Derrick Stolee	c80dd3967f	cache-tree: extract subtree_pos() This method will be helpful to use outside of cache-tree.c in a later feature. The implementation is subtle due to subtree_name_cmp() sorting by length and then lexicographically. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-23 17:14:07 -08:00
Derrick Stolee	8d87e338e1	cache-tree: simplify verify_cache() prototype The verify_cache() method takes an array of cache entries and a count, but these are always provided directly from a struct index_state. Use a pointer to the full structure instead. There is a subtle point when istate->cache_nr is zero that subtracting one will underflow. This triggers a failure in t0000-basic.sh, among others. Use "i + 1 < istate->cache_nr" to avoid these strange comparisons. Convert i to be unsigned as well, which also removes the potential signed overflow in the unlikely case that cache_nr is over 2.1 billion entries. The 'funny' variable has a maximum value of 11, so making it unsigned does not change anything of importance. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-23 17:14:07 -08:00
Derrick Stolee	fb0882648e	cache-tree: clean up cache_tree_update() Make the method safer by allocating a cache_tree member for the given index_state if it is not already present. This is preferrable to a BUG() statement or returning with an error because future callers will want to populate an empty cache-tree using this method. Callers can also remove their conditional allocations of cache_tree. Also drop local variables that can be found directly from the 'istate' parameter. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-23 17:14:07 -08:00
Ævar Arnfjörð Bjarmason	db89a82b5b	rm tests: actually test for SIGPIPE in SIGPIPE test Change a test initially added in `50cd31c652` (t3600: comment on inducing SIGPIPE in `git rm`, 2019-11-27) to explicitly test for SIGPIPE using a pattern initially established in `7559a1be8a` (unblock and unignore SIGPIPE, 2014-09-18). The problem with using that pattern is that it requires us to skip the test on MINGW[1]. If we kept the test with its initial semantics[2] we'd get coverage there, at the cost of not checking whether we actually had SIGPIPE outside of MinGW. Arguably we should just remove this test. Between the test added in `7559a1be8a` and the change made in `12e0437f23` (common-main: call restore_sigpipe_to_default(), 2016-07-01) it's a bit arbitrary to only check this for "git rm". But in lieu of having wider test coverage for other "git" subcommands let's refactor this to explicitly test for SIGPIPE outside of MinGW, and then just that we remove the ".git/index.lock" (as before) on all platforms. 1. https://lore.kernel.org/git/xmqq1rec5ckf.fsf@gitster.c.googlers.com/ 2. `0693f9ddad` (Make sure lockfiles are unlocked when dying on SIGPIPE, 2008-12-18) Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-23 13:25:12 -08:00
Ævar Arnfjörð Bjarmason	60127996b5	archive tests: use a cheaper "zipinfo -h" invocation to get header Change an invocation of zipinfo added in `19ee29401d` (t5004: test ZIP archives with many entries, 2015-08-22) to simply ask zipinfo for the header info, rather than spewing out info about the entire archive and race to kill it with SIGPIPE due to the downstream "head -2". I ran across this because I'm adding a "set -o pipefail" test mode. This won't be needed for the version of the mode that I'm introducing (which currently relies on a patch to GNU bash), but I think this is a good idea anyway. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-23 13:25:12 -08:00
Ævar Arnfjörð Bjarmason	9aebc4708a	upload-pack tests: avoid a non-zero "grep" exit status Continue changing a test that `763b47bafa` (t5703: stop losing return codes of git commands, 2019-11-27) already refactored. This was originally added as part of a series to add support for running under bash's "set -o pipefail", under that mode this test will fail because sometimes there's no commits in the "objs" output. It's easier to fix that than exempt these tests under a hypothetical "set -o pipefail" test mode. It looks like we probably won't have that, but once we've dug this code up let's refactor it[2] so we don't hide a potential pipe failure. 1. https://lore.kernel.org/git/xmqqzh18o8o6.fsf@gitster.c.googlers.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-23 13:25:12 -08:00
Jeff King	796c248dc1	git-svn tests: rewrite brittle tests to use "--[no-]merges". Rewrite a brittle tests which used "rev-list" without "--[no-]merges" to figure out if a set of commits turned into merge commits or not. Signed-off-by: Jeff King <peff@peff.net> [ÆAB: wrote commit message] Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-23 13:25:12 -08:00
Ævar Arnfjörð Bjarmason	f918a89e50	git svn mergeinfo tests: refactor "test -z" to use test_must_be_empty Refactor some old-style test code to use test_must_be_empty instead of "test -z". This makes a follow-up commit easier to read. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-23 13:25:12 -08:00
Ævar Arnfjörð Bjarmason	4669917e8f	git svn mergeinfo tests: modernize redirection & quoting style Use "<file" instead of "< file", and don't put the closing quote for strings on an indented line. This makes a follow-up refactoring commit easier to read. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-23 13:25:12 -08:00
Ævar Arnfjörð Bjarmason	ef83970059	cache-tree tests: explicitly test HEAD and index differences The test code added in `9c4d6c0297` (cache-tree: Write updated cache-tree after commit, 2014-07-13) used "ls-files" in lieu of "ls-tree" because it wanted to test the data in the index, since this test is testing the cache-tree extension. Change the test to instead use "ls-tree" for traversal, and then explicitly check how HEAD differs from the index. This is more easily understood, and less fragile as numerous past bug fixes[1][2][3] to the old code we're replacing demonstrate. As an aside this would be a bit easier if empty pathspecs hadn't been made an error in `d426430e6e` (pathspec: warn on empty strings as pathspec, 2016-06-22) and `9e4e8a64c2` (pathspec: die on empty strings as pathspec, 2017-06-06). If that was still allowed this code could be simplified slightly: diff --git a/t/t0090-cache-tree.sh b/t/t0090-cache-tree.sh index 9bf66c9e68..0b02881f55 100755 --- a/t/t0090-cache-tree.sh +++ b/t/t0090-cache-tree.sh @@ -18,19 +18,18 @@ cmp_cache_tree () { # test-tool dump-cache-tree already verifies that all existing data is # correct. generate_expected_cache_tree () { - pathspec="$1" && - dir="$2${2:+/}" && + pathspec="$1${1:+/}" && git ls-tree --name-only HEAD -- "$pathspec" >files && git ls-tree --name-only -d HEAD -- "$pathspec" >subtrees && - printf "SHA %s (%d entries, %d subtrees)\n" "$dir" $(wc -l <files) $(wc -l <subtrees) && + printf "SHA %s (%d entries, %d subtrees)\n" "$pathspec" $(wc -l <files) $(wc -l <subtrees) && while read subtree do - generate_expected_cache_tree "$pathspec/$subtree/" "$subtree" \|\| return 1 + generate_expected_cache_tree "$subtree" \|\| return 1 done <subtrees } test_cache_tree () { - generate_expected_cache_tree "." >expect && + generate_expected_cache_tree >expect && cmp_cache_tree expect && rm expect actual files subtrees && git status --porcelain -- ':!status' ':!expected.status' >status && 1. `c8db708d5d` (t0090: avoid passing empty string to printf %d, 2014-09-30) 2. `d69360c6b1` (t0090: tweak awk statement for Solaris /usr/xpg4/bin/awk, 2014-12-22) 3. `9b5a9fa60a` (t0090: stop losing return codes of git commands, 2019-11-27) Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-23 13:25:12 -08:00
Ævar Arnfjörð Bjarmason	fa6edee776	cache-tree tests: use a sub-shell with less indirection Change a "cd xyz && work && cd .." pattern introduced in `9c4d6c0297` (cache-tree: Write updated cache-tree after commit, 2014-07-13) to use a sub-shell instead with less indirection. We did actually recover correctly if we failed in this function since we were wrapped in a subshell one function call up. Let's just use the sub-shell at the point where we want to change the directory instead. It's important that the "\|\| return 1" is outside the subshell. Normally, we `exit 1` from within subshells[1], but that wouldn't help us exit this loop early[1][2]. Since we can get rid of the wrapper function let's rename the main function to drop the "rec" (for "recursion") suffix[3]. 1. https://lore.kernel.org/git/CAPig+cToj8nQmyBCqC1k7DXF2vXaonCEA-fCJ4x7JBZG2ixYBw@mail.gmail.com/ 2. https://lore.kernel.org/git/20150325052952.GE31924@peff.net/ 3. https://lore.kernel.org/git/YARsCsgXuiXr4uFX@coredump.intra.peff.net/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-23 13:25:12 -08:00
Ævar Arnfjörð Bjarmason	3226725507	cache-tree tests: remove unused $2 parameter Remove the $2 paramater. This appears to have been some work-in-progress code from an earlier version of `9c4d6c0297` (cache-tree: Write updated cache-tree after commit, 2014-07-13) which was left in the final version. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-23 13:25:12 -08:00
Ævar Arnfjörð Bjarmason	3f96d75ef5	cache-tree tests: refactor for modern test style Refactor the cache-tree test file to use our current recommended patterns. This makes a subsequent meaningful change easier to read. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-23 13:25:11 -08:00
ZheNing Hu	93a7d9835f	ls-files.c: add --deduplicate option During a merge conflict, the name of a file may appear multiple times in "git ls-files" output, once for each stage. If you use both `--delete` and `--modify` at the same time, the output may mention a deleted file twice. When none of the '-t', '-u', or '-s' options is in use, these duplicate entries do not add much value to the output. Introduce a new '--deduplicate' option to suppress them. Signed-off-by: ZheNing Hu <adlternative@gmail.com> [jc: extended doc and rewritten commit log] Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-23 11:48:20 -08:00
ZheNing Hu	ed644d1666	ls_files.c: consolidate two for loops into one This will make it easier to show only one entry per filename in the next step. Signed-off-by: ZheNing Hu <adlternative@gmail.com> [jc: corrected the log message] Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-23 11:48:20 -08:00
ZheNing Hu	f1c462ea41	ls_files.c: bugfix for --deleted and --modified This situation may occur in the original code: lstat() failed but we use `&st` to feed ie_modified() later. Therefore, we can directly execute show_ce without the judgment of ie_modified() when lstat() has failed. Signed-off-by: ZheNing Hu <adlternative@gmail.com> [jc: fixed misindented code] Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-23 11:48:11 -08:00
Taylor Blau	b3970c702c	ls-refs.c: traverse prefixes of disjoint "ref-prefix" sets ls-refs performs a single revision walk over the whole ref namespace, and sends ones that match with one of the given ref prefixes down to the user. This can be expensive if there are many refs overall, but the portion of them covered by the given prefixes is small by comparison. To attempt to reduce the difference between the number of refs traversed, and the number of refs sent, only traverse references which are in the longest common prefix of the given prefixes. This is very reminiscent of the approach taken in `b31e2680c4` (ref-filter.c: find disjoint pattern prefixes, 2019-06-26) which does an analogous thing for multi-patterned 'git for-each-ref' invocations. The callback 'send_ref' is resilient to ignore extra patterns by discarding any arguments which do not begin with at least one of the specified prefixes. Similarly, the code introduced in `b31e2680c4` is resilient to stop early at metacharacters, but we only pass strict prefixes here. At worst we would return too many results, but the double checking done by send_ref will throw away anything that doesn't start with something in the prefix list. Finally, if no prefixes were provided, then implicitly add the empty string (which will match all references) since this matches the existing behavior (see the "no restrictions" comment in "ls-refs.c:ref_match()"). Original-patch-by: Jacob Vosmaer <jacob@gitlab.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-22 18:57:27 -08:00
Jacob Vosmaer	83befd3724	ls-refs.c: initialize 'prefixes' before using it Correctly initialize the "prefixes" strvec using strvec_init() instead of simply zeroing it via the earlier memset(). There's no way to trigger a crash, since the first 'ref-prefix' command will initialize the strvec via the 'ALLOC_GROW' in 'strvec_push_nodup()' (the alloc and nr variables are already zero'd, so the call to ALLOC_GROW is valid). If no "ref-prefix" command was given, then the call to 'ls-refs.c:ref_match()' will abort early after it reads the zero in 'prefixes->nr'. Likewise, strvec_clear() will only call free() on the array, which is NULL, so we're safe there, too. But, all of this is dangerous and requires more reasoning than it would if we simply called 'strvec_init()', so do that. Signed-off-by: Jacob Vosmaer <jacob@gitlab.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-22 18:57:27 -08:00
Taylor Blau	16b1985be5	refs: expose 'for_each_fullref_in_prefixes' This function was used in the ref-filter.c code to find the longest common prefix of among a set of refspecs, and then to iterate all of the references that descend from that prefix. A future patch will want to use that same code from ls-refs.c, so prepare by exposing and moving it to refs.c. Since there is nothing specific to the ref-filter code here (other than that it was previously the only caller of this function), this really belongs in the more generic refs.h header. The code moved in this patch is identical before and after, with the one exception of renaming some arguments to be consistent with other functions exposed in refs.h. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-22 18:57:27 -08:00
Jacob Vosmaer	be18153b97	builtin/pack-objects.c: avoid iterating all refs In git-pack-objects, we iterate over all the tags if the --include-tag option is passed on the command line. For some reason this uses for_each_ref which is expensive if the repo has many refs. We should use for_each_tag_ref instead. Because the add_ref_tag callback will now only visit tags we simplified it a bit. The motivation for this change is that we observed performance issues with a repository on gitlab.com that has 500,000 refs but only 2,000 tags. The fetch traffic on that repo is dominated by CI, and when we changed CI to fetch with 'git fetch --no-tags' we saw a dramatic change in the CPU profile of git-pack-objects. This lead us to this particular ref walk. More details in: https://gitlab.com/gitlab-com/gl-infra/scalability/-/issues/746#note_483546598 Signed-off-by: Jacob Vosmaer <jacob@gitlab.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-22 17:27:42 -08:00
Jeff King	ee4e22554f	run-command: document use_shell option It's unclear how run-command's use_shell option should impact the arguments fed to a command. Plausibly it could mean that we glue all of the arguments together into a string to pass to the shell, in which case that opens the question of whether the caller needs to quote them. But in fact we don't implement it that way (and even if we did, we'd probably auto-quote the arguments as part of the glue step). And we must not receive quoted arguments, because we might actually optimize out the shell entirely (i.e., the caller does not even know if a shell will be involved in the end or not). Since this ambiguity may have been the cause of a recent bug, let's document the option a bit. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-22 14:21:32 -08:00
Jiang Xin	822ee894f6	t5411: refactor check of refs using test_cmp_refs Add new helper 'test_cmp_refs' to check references in a repository. Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-22 13:09:06 -08:00
Jiang Xin	8388a64cd1	t5411: use different out file to prevent overwriting SZEDER reported that t5411 failed in Travis CI's s390x environment a couple of times, and could be reproduced with '--stress' test on this specific environment. The test failure messages might look like this: + test_cmp expect actual --- expect 2021-01-17 21:55:23.430750004 +0000 +++ actual 2021-01-17 21:55:23.430750004 +0000 @@ -1 +1 @@ -<COMMIT-A> refs/heads/main +<COMMIT-A> refs/heads/maifatal: the remote end hung up unexpectedly error: last command exited with $?=1 not ok 86 - proc-receive: not support push options (builtin protocol) The file 'actual' is filtered from the file 'out' which contains result of 'git show-ref' command. Due to the error messages from other process is written into the file 'out' accidentally, t5411 failed. SZEDER finds the root cause of this issue: - 'git push' is executed with its standard output and error redirected to the file 'out'. - 'git push' executes 'git receive-pack' internally, which inherits the open file descriptors, so its output and error goes into that same 'out' file. - 'git push' ends without waiting for the close of 'git-receive-pack' for some cases, and the file 'out' is reused for test of 'git show-ref' afterwards. - A mixture of the output of 'git show-ref' abd 'git receive-pack' leads to this issue. The first intuitive reaction to resolve this issue is to remove the file 'out' after use, so that the newly created file 'out' will have a different file descriptor and will not be overwritten by the 'git receive-pack' process. But Johannes pointed out that removing an open file is not possible on Windows. So we use different temporary file names to store the output of 'git push' to solve this issue. Reported-by: SZEDER Gábor <szeder.dev@gmail.com> Helped-by: Johannes Sixt <j6t@kdbg.org> Helped-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-22 13:09:04 -08:00
Phil Hord	8198907795	use delete_refs when deleting tags or branches 'git tag -d' accepts one or more tag refs to delete, but each deletion is done by calling `delete_ref` on each argv. This is very slow when removing from packed refs. Use delete_refs instead so all the removals can be done inside a single transaction with a single update. Do the same for 'git branch -d'. Since delete_refs performs all the packed-refs delete operations inside a single transaction, if any of the deletes fail then all them will be skipped. In practice, none of them should fail since we verify the hash of each one before calling delete_refs, but some network error or odd permissions problem could have different results after this change. Also, since the file-backed deletions are not performed in the same transaction, those could succeed even when the packed-refs transaction fails. After deleting branches, remove the branch config only if the branch ref was removed and was not subsequently added back in. A manual test deleting 24,000 tags took about 30 minutes using delete_ref. It takes about 5 seconds using delete_refs. Acked-by: Elijah Newren <newren@gmail.com> Signed-off-by: Phil Hord <phil.hord@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-21 16:05:05 -08:00
Jeff King	36a317929b	refs: switch peel_ref() to peel_iterated_oid() The peel_ref() interface is confusing and error-prone: - it's typically used by ref iteration callbacks that have both a refname and oid. But since they pass only the refname, we may load the ref value from the filesystem again. This is inefficient, but also means we are open to a race if somebody simultaneously updates the ref. E.g., this: int some_ref_cb(const char refname, const struct object_id oid, ...) { if (!peel_ref(refname, &peeled)) printf("%s peels to %s", oid_to_hex(oid), oid_to_hex(&peeled); } could print nonsense. It is correct to say "refname peels to..." (you may see the "before" value or the "after" value, either of which is consistent), but mentioning both oids may be mixing before/after values. Worse, whether this is possible depends on whether the optimization to read from the current iterator value kicks in. So it is actually not possible with: for_each_ref(some_ref_cb); but it _is_ possible with: head_ref(some_ref_cb); which does not use the iterator mechanism (though in practice, HEAD should never peel to anything, so this may not be triggerable). - it must take a fully-qualified refname for the read_ref_full() code path to work. Yet we routinely pass it partial refnames from callbacks to for_each_tag_ref(), etc. This happens to work when iterating because there we do not call read_ref_full() at all, and only use the passed refname to check if it is the same as the iterator. But the requirements for the function parameters are quite unclear. Instead of taking a refname, let's instead take an oid. That fixes both problems. It's a little funny for a "ref" function not to involve refs at all. The key thing is that it's optimizing under the hood based on having access to the ref iterator. So let's change the name to make it clear why you'd want this function versus just peel_object(). There are two other directions I considered but rejected: - we could pass the peel information into the each_ref_fn callback. However, we don't know if the caller actually wants it or not. For packed-refs, providing it is essentially free. But for loose refs, we actually have to peel the object, which would be wasteful in most cases. We could likewise pass in a flag to the callback indicating whether the peeled information is known, but that complicates those callbacks, as they then have to decide whether to manually peel themselves. Plus it requires changing the interface of every callback, whether they care about peeling or not, and there are many of them. - we could make a function to return the peeled value of the current iterated ref (computing it if necessary), and BUG() otherwise. I.e.: int peel_current_iterated_ref(struct object_id *out); Each of the current callers is an each_ref_fn callback, so they'd mostly be happy. But: - we use those callbacks with functions like head_ref(), which do not use the iteration code. So we'd need to handle the fallback case there, anyway. - it's possible that a caller would want to call into generic code that sometimes is used during iteration and sometimes not. This encapsulates the logic to do the fast thing when possible, and fallback when necessary. The implementation is mostly obvious, but I want to call out a few things in the patch: - the test-tool coverage for peel_ref() is now meaningless, as it all collapses to a single peel_object() call (arguably they were pretty uninteresting before; the tricky part of that function is the fast-path we see during iteration, but these calls didn't trigger that). I've just dropped it entirely, though note that some other tests relied on the tags we created; I've moved that creation to the tests where it matters. - we no longer need to take a ref_store parameter, since we'd never look up a ref now. We do still rely on a global "current iterator" variable which _could_ be kept per-ref-store. But in practice this is only useful if there are multiple recursive iterations, at which point the more appropriate solution is probably a stack of iterators. No caller used the actual ref-store parameter anyway (they all call the wrapper that passes the_repository). - the original only kicked in the optimization when the "refname" pointer matched (i.e., not string comparison). We do likewise with the "oid" parameter here, but fall back to doing an actual oideq() call. This in theory lets us kick in the optimization more often, though in practice no current caller cares. It should never be wrong, though (peeling is a property of an object, so two refs pointing to the same object would peel identically). - the original took care not to touch the peeled out-parameter unless we found something to put in it. But no caller cares about this, and anyway, it is enforced by peel_object() itself (and even in the optimized iterator case, that's where we eventually end up). We can shorten the code and avoid an extra copy by just passing the out-parameter through the stack. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-21 15:51:31 -08:00
Ævar Arnfjörð Bjarmason	73c01d25fe	tests: remove uses of GIT_TEST_GETTEXT_POISON=false As noted in previous commits we are removing the use of GIT_TEST_GETTEXT_POISON=false. These tests all relied on the facility being off, it always is off after an earlier change, but we hadn't removed the redundant assignments to "false" in the tests. I'm preserving the deletion of "error" lines in `38b9197a76` (t5411: add basic test cases for proc-receive hook, 2020-08-27), it turns out that's useful even without GIT_TEST_GETTEXT_POISON=true in play. Update a comment added in that commit to note that. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-21 15:50:03 -08:00
Ævar Arnfjörð Bjarmason	d162b25f95	tests: remove support for GIT_TEST_GETTEXT_POISON This removes the ability to inject "poison" gettext() messages via the GIT_TEST_GETTEXT_POISON special test setup. I initially added this as a compile-time option in `bb946bba76` (i18n: add GETTEXT_POISON to simulate unfriendly translator, 2011-02-22), and most recently modified to be toggleable at runtime in `6cdccfce1e` (i18n: make GETTEXT_POISON a runtime option, 2018-11-08).. The reason for its removal is that the trade-off of maintaining it v.s. what it's getting us has long since flipped. When gettext was integrated in `5e9637c629` (i18n: add infrastructure for translating Git with gettext, 2011-11-18) there was understandable concern on the Git ML that in marking messages for translation en-masse we'd inadvertently mark plumbing messages. The GETTEXT_POISON facility was a way to smoke those out via our test suite. Nowadays however we're done (or almost entirely done) with any marking of messages for translation. New messages are usually marked by their authors, who'll know whether it makes sense to translate them or not. If not any errors in marking the messages are much more likely to be spotted in review than in the the initial deluge of i18n patches in the 2011-2012 era. So let's just remove this. This leaves the test suite in a state where we still have a lot of test_i18n, C_LOCALE_OUTPUT etc. uses. Subsequent commits will remove those too. The change to t/lib-rebase.sh is a selective revert of the relevant part of `f2d17068fd` (i18n: rebase-interactive: mark comments of squash for translation, 2016-06-17), and the comment in t/t3406-rebase-message.sh is from `c7108bf9ed` (i18n: rebase: mark messages for translation, 2012-07-25). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-21 15:50:01 -08:00
Ævar Arnfjörð Bjarmason	6c280b4142	ci: remove GETTEXT_POISON jobs A subsequent commit will remove GETTEXT_POISON entirely, let's start by removing the CI jobs that enable the option. We cannot just remove the job because the CI is implicitly depending on the "poison" job being a sort of "default" job in the sense that it's the job that was otherwise run with the default compiler, no other GIT_TEST_* options etc. So let's keep it under the name "linux-gcc-default". This means we can remove the initial "make test" from the "linux-gcc" job (it does another one after setting a bunch of GIT_TEST_* variables). I'm not doing that because it would conflict with the in-flight `334afbc76f` (tests: mark tests relying on the current default for `init.defaultBranch`, 2020-11-18) (currently on the "seen" branch, so the SHA-1 will almost definitely change). It's going to use that "make test" again for different reasons, so let's preserve it for now. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-21 15:50:00 -08:00
Junio C Hamano	fe2f4d0031	Merge branch 'en/ort-directory-rename' into en/merge-ort-perf * en/ort-directory-rename: (28 commits) merge-ort: fix a directory rename detection bug merge-ort: process_renames() now needs more defensiveness merge-ort: implement apply_directory_rename_modifications() merge-ort: add a new toplevel_dir field merge-ort: implement handle_path_level_conflicts() merge-ort: implement check_for_directory_rename() merge-ort: implement apply_dir_rename() and check_dir_renamed() merge-ort: implement compute_collisions() merge-ort: modify collect_renames() for directory rename handling merge-ort: implement handle_directory_level_conflicts() merge-ort: implement compute_rename_counts() merge-ort: copy get_renamed_dir_portion() from merge-recursive.c merge-ort: add outline of get_provisional_directory_renames() merge-ort: add outline for computing directory renames merge-ort: collect which directories are removed in dirs_removed merge-ort: initialize and free new directory rename data structures merge-ort: add new data structures for directory rename detection merge-ort: add implementation of type-changed rename handling merge-ort: add implementation of normal rename handling merge-ort: add implementation of rename collisions ...	2021-01-20 22:52:50 -08:00
Elijah Newren	203c872c4f	merge-ort: fix a directory rename detection bug As noted in commit `902c521a35` ("t6423: more involved directory rename test", 2020-10-15), when we have a case where * dir/subdir/ has several files * almost all files in dir/subdir/ are renamed to folder/subdir/ * one of the files in dir/subdir/ is renamed to folder/subdir/newsubdir/ * the other side of history (that doesn't do the renames) adds a new file to dir/subdir/ Then for the majority of the file renames, the directory rename of dir/subdir/ -> folder/subdir/ is actually not represented that way but as dir/ -> folder/ We also had one rename that was represented as dir/subdir/ -> folder/subdir/newsubdir/ Now, since there's a new file in dir/subdir/, where does it go? Well, there's only one rule for dir/subdir/, so the code previously noted that this rule had the "majority" of the one "relevant" rename and thus erroneously used it to place the file in folder/subdir/newsubdir/. We really want the heavy weight associated with dir/ -> folder/ to also be treated as dir/subdir/ -> folder/subdir/, so that we correctly place the file in folder/subdir/. Add a bunch of logic to make sure that we use all relevant renamings in directory rename detection. Note that testcase 12f of t6423 still fails after this, but it gets further than merge-recursive does. There are some performance related bits in that testcase (the region_enter messages) that do not yet succeed, but the rest of the testcase works after this patch. Subsequent patch series will fix up the performance side. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-20 22:18:55 -08:00
Elijah Newren	1b6b902d95	merge-ort: process_renames() now needs more defensiveness Since directory rename detection adds new paths to opt->priv->paths and removes old ones, process_renames() needs to now check whether pair->one->path actually exists in opt->priv->paths instead of just assuming it does. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-20 22:18:55 -08:00
Elijah Newren	089d82bc18	merge-ort: implement apply_directory_rename_modifications() This function roughly follows the same outline as the function of the same name from merge-recursive.c, but the code diverges in multiple ways due to some special considerations: * merge-ort's version needs to update opt->priv->paths with any new paths (and opt->priv->paths points to struct conflict_infos which track quite a bit of metadata for each path); merge-recursive's version would directly update the index * merge-ort requires that opt->priv->paths has any leading directories of any relevant files also be included in the set of paths. And due to pointer equality requirements on merged_info.directory_name, we have to be careful how we compute and insert these. * due to the above requirements on opt->priv->paths, merge-ort's version starts with a long comment to explain all the special considerations that need to be handled * merge-ort can use the full data stored in opt->priv->paths to avoid making expensive get_tree_entry() calls to regather the necessary data. * due to messages being deferred automatically in merge-ort, this is the best place to handle conflict messages whereas in merge-recursive.c they are deferred manually so that processing of entries does all the printing Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-20 22:18:55 -08:00
Elijah Newren	05b85c6eeb	merge-ort: add a new toplevel_dir field Due to the string-equality-iff-pointer-equality requirements placed on merged_info.directory_name, apply_directory_rename_modifications() will need to have access to the exact toplevel directory name string pointer and can't just use a new empty string. Store it in a field that we can use. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-20 22:18:55 -08:00
Elijah Newren	bea433655a	merge-ort: implement handle_path_level_conflicts() This is copied from merge-recursive.c, with minor tweaks due to: * using strmap API * merge-ort not using the non_unique_new_dir field, since it'll obviate its need entirely later with performance improvements * adding a new path_in_way() function that uses opt->priv->paths instead of doing an expensive tree_has_path() lookup to see if a tree has a given path. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-20 22:18:55 -08:00
Elijah Newren	47325e8533	merge-ort: implement check_for_directory_rename() This is copied from merge-recursive.c, with minor tweaks due to using strmap API and the fact that it can use opt->priv->paths to get all pathnames that exist instead of taking a tree object. This depends on a new function, handle_path_level_conflicts(), which just has a placeholder die-not-yet-implemented implementation for now; a subsequent patch will implement it. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-20 22:18:55 -08:00
Elijah Newren	fbcfc0cc17	merge-ort: implement apply_dir_rename() and check_dir_renamed() Both of these are copied from merge-recursive.c, with just minor tweaks due to using strmap API and not having a non_unique_new_dir field. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-20 22:18:55 -08:00
Elijah Newren	d9d015df4a	merge-ort: implement compute_collisions() This is nearly a wholesale copy of compute_collisions() from merge-recursive.c, and the logic remains the same, but it has been tweaked slightly due to: * using strmap.h API (instead of direct hashmaps) * allocation/freeing of data structures were done separately in merge_start() and clear_or_reinit_internal_opts() in an earlier patch in this series * there is no non_unique_new_dir data field in merge-ort; that will be handled a different way It does depend on two new functions, apply_dir_rename() and check_dir_renamed() which were introduced with simple die-not-yet-implemented shells and will be implemented in subsequent patches. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-20 22:18:55 -08:00
Elijah Newren	fa5e06d690	merge-ort: modify collect_renames() for directory rename handling collect_renames() is similar to merge-recursive.c's get_renames(), but lacks the directory rename handling found in the latter. Port that code structure over to merge-ort. This introduces three new die-not-yet-implemented functions that will be defined in future commits. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-20 22:18:55 -08:00
Elijah Newren	98d0d08128	merge-ort: implement handle_directory_level_conflicts() This is modelled on the version of handle_directory_level_conflicts() from merge-recursive.c, but is massively simplified due to the following factors: * strmap API provides simplifications over using direct hashmap * we have a dirs_removed field in struct rename_info that we have an easy way to populate from collect_merge_info(); this was already used in compute_rename_counts() and thus we do not need to check for condition #2. * The removal of condition #2 by handling it earlier in the code also obviates the need to check for condition #3 -- if both sides renamed a directory, meaning that the directory no longer exists on either side, then neither side could have added any new files to that directory, and thus there are no files whose locations we need to move due to such a directory rename. In fact, the same logic that makes condition #3 irrelevant means condition #1 is also irrelevant so we could drop this function. However, it is cheap to check if both sides rename the same directory, and doing so can save future computation. So, simply remove any directories that both sides renamed from the list of directory renames. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-20 22:18:55 -08:00
Elijah Newren	2f620a4f19	merge-ort: implement compute_rename_counts() This function is based on the first half of get_directory_renames() from merge-recursive.c; as part of the implementation, factor out a routine, increment_count(), to update the bookkeeping to track the number of items renamed into new directories. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-20 22:18:55 -08:00
Elijah Newren	9fe37e7bb9	merge-ort: copy get_renamed_dir_portion() from merge-recursive.c Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-20 22:18:55 -08:00
Elijah Newren	04264d4079	merge-ort: add outline of get_provisional_directory_renames() This function is based on merge-recursive.c's get_directory_renames(), except that the first half has been split out into a not-yet-implemented compute_rename_counts(). The primary difference here is our lack of the non_unique_new_dir boolean in our strmap. The lack of that field will at first cause us to fail testcase 2b of t6423; however, future optimizations will obviate the need for that ugly field so we have just left it out. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-20 22:18:55 -08:00
Elijah Newren	112e11126b	merge-ort: add outline for computing directory renames Port some directory rename handling changes from merge-recursive.c's detect_and_process_renames() to the same-named function of merge-ort.c. This does not yet add any use or handling of directory renames, just the outline for where we start to compute them. Thus, a future patch will add port additional changes to merge-ort's detect_and_process_renames(). Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-20 22:18:55 -08:00
Derrick Stolee	3cf5f221be	t7900: clean up some broken refs The tests for the 'prefetch' task create remotes and fetch refs into 'refs/prefetch/<remote>/' and tags into 'refs/tags/'. These tests use the remotes to create objects not intended to be seen by the "local" repository. In that sense, the incrmental-repack tasks did not have these objects and refs in mind. That test replaces the object directory with a specific pack-file layout for testing the batch-size logic. However, this causes some operations to start showing warnings such as: error: refs/prefetch/remote1/one does not point to a valid object! error: refs/tags/one does not point to a valid object! This only shows up if you run the tests verbosely and watch the output. It caught my eye and I _thought_ that there was a bug where 'git gc' or 'git repack' wouldn't check 'refs/prefetch/' before pruning objects. That is incorrect. Those commands do handle 'refs/prefetch/' correctly. All that is left is to clean up the tests in t7900-maintenance.sh to remove these tags and refs that are not being repacked for the incremental-repack tests. Use update-ref to ensure this works with all ref backends. Helped-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-20 18:46:22 -08:00
Derrick Stolee	96eaffebbf	maintenance: set log.excludeDecoration durin prefetch The 'prefetch' task fetches refs from all remotes and places them in the refs/prefetch/<remote>/ refspace. As this task is intended to run in the background, this allows users to keep their local data very close to the remote servers' data while not updating the users' understanding of the remote refs in refs/remotes/<remote>/. However, this can clutter 'git log' decorations with copies of the refs with the full name 'refs/prefetch/<remote>/<branch>'. The log.excludeDecoration config option was added in `a6be5e67` (log: add log.excludeDecoration config option, 2020-05-16) for exactly this purpose. Ensure we set this only for users that would benefit from it by assigning it at the beginning of the prefetch task. Other alternatives would be during 'git maintenance register' or 'git maintenance start', but those might assign the config even when the prefetch task is disabled by existing config. Further, users could run 'git maintenance run --task=prefetch' using their own scripting or scheduling. This provides the best coverage to automatically update the config when valuable. It is improbable, but possible, that users might want to run the prefetch task _and_ see these refs in their log decorations. This seems incredibly unlikely to me, but users can always opt-in on a command-by-command basis using --decorate-refs=refs/prefetch/. Test that this works in a few cases. In particular, ensure that our assignment of log.excludeDecoration=refs/prefetch/ is additive to other existing exclusions. Further, ensure we do not add multiple copies in multiple runs. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-20 18:46:22 -08:00
Phillip Wood	498bb5b82e	sequencer: factor out code to append squash message This code is going to grow over the next two commits so move it to its own function. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Charvi Mendiratta <charvi077@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-20 17:50:11 -08:00
Phillip Wood	eab0df0e5b	rebase -i: only write fixup-message when it's needed The file "$GIT_DIR/rebase-merge/fixup-message" is only used for fixup commands, there's no point in writing it for squash commands as it is immediately deleted. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Charvi Mendiratta <charvi077@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-20 17:50:11 -08:00
brian m. carlson	1fb5cf0da6	commit: ignore additional signatures when parsing signed commits When we create a commit with multiple signatures, neither of these signatures includes the other. Consequently, when we produce the payload which has been signed so we can verify the commit, we must strip off any other signatures, or the payload will differ from what was signed. Do so, and in preparation for verifying with multiple algorithms, pass the algorithm we want to verify into parse_signed_commit. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-18 17:38:20 -08:00
brian m. carlson	83dff3eb2e	ref-filter: switch some uses of unsigned long to size_t In the future, we'll want to pass some of the arguments of find_subpos to strbuf_detach, which takes a size_t. This is fine on systems where that's the same size as unsigned long, but that isn't the case on all systems. Moreover, size_t makes sense since it's not possible to use a buffer here that's larger than memory anyway. Let's switch each use to size_t for these lengths in grab_sub_body_contents and find_subpos. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-18 17:38:19 -08:00
Abhishek Kumar	5a3b130cad	doc: add corrected commit date info With generation data chunk and corrected commit dates implemented, let's update the technical documentation for commit-graph. Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-18 16:21:18 -08:00
Abhishek Kumar	8d00d7c3df	commit-reach: use corrected commit dates in paint_down_to_common() `091f4cf` (commit: don't use generation numbers if not needed, 2018-08-30) changed paint_down_to_common() to use commit dates instead of generation numbers v1 (topological levels) as the performance regressed on certain topologies. With generation number v2 (corrected commit dates) implemented, we no longer have to rely on commit dates and can use generation numbers. For example, the command `git merge-base v4.8 v4.9` on the Linux repository walks 167468 commits, taking 0.135s for committer date and 167496 commits, taking 0.157s for corrected committer date respectively. While using corrected commit dates, Git walks nearly the same number of commits as commit date, the process is slower as for each comparision we have to access a commit-slab (for corrected committer date) instead of accessing struct member (for committer date). This change incidentally broke the fragile t6404-recursive-merge test. t6404-recursive-merge sets up a unique repository where all commits have the same committer date without a well-defined merge-base. While running tests with GIT_TEST_COMMIT_GRAPH unset, we use committer date as a heuristic in paint_down_to_common(). 6404.1 'combined merge conflicts' merges commits in the order: - Merge C with B to form an intermediate commit. - Merge the intermediate commit with A. With GIT_TEST_COMMIT_GRAPH=1, we write a commit-graph and subsequently use the corrected committer date, which changes the order in which commits are merged: - Merge A with B to form an intermediate commit. - Merge the intermediate commit with C. While resulting repositories are equivalent, 6404.4 'virtual trees were processed' fails with GIT_TEST_COMMIT_GRAPH=1 as we are selecting different merge-bases and thus have different object ids for the intermediate commits. As this has already causes problems (as noted in `859fdc0` (commit-graph: define GIT_TEST_COMMIT_GRAPH, 2018-08-29)), we disable commit graph within t6404-recursive-merge. Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-18 16:21:18 -08:00
Abhishek Kumar	1fdc383c5c	commit-graph: use generation v2 only if entire chain does Since there are released versions of Git that understand generation numbers in the commit-graph's CDAT chunk but do not understand the GDAT chunk, the following scenario is possible: 1. "New" Git writes a commit-graph with the GDAT chunk. 2. "Old" Git writes a split commit-graph on top without a GDAT chunk. If each layer of split commit-graph is treated independently, as it was the case before this commit, with Git inspecting only the current layer for chunk_generation_data pointer, commits in the lower layer (one with GDAT) whould have corrected commit date as their generation number, while commits in the upper layer would have topological levels as their generation. Corrected commit dates usually have much larger values than topological levels. This means that if we take two commits, one from the upper layer, and one reachable from it in the lower layer, then the expectation that the generation of a parent is smaller than the generation of a child would be violated. It is difficult to expose this issue in a test. Since we _start_ with artificially low generation numbers, any commit walk that prioritizes generation numbers will walk all of the commits with high generation number before walking the commits with low generation number. In all the cases I tried, the commit-graph layers themselves "protect" any incorrect behavior since none of the commits in the lower layer can reach the commits in the upper layer. This issue would manifest itself as a performance problem in this case, especially with something like "git log --graph" since the low generation numbers would cause the in-degree queue to walk all of the commits in the lower layer before allowing the topo-order queue to write anything to output (depending on the size of the upper layer). Therefore, When writing the new layer in split commit-graph, we write a GDAT chunk only if the topmost layer has a GDAT chunk. This guarantees that if a layer has GDAT chunk, all lower layers must have a GDAT chunk as well. Rewriting layers follows similar approach: if the topmost layer below the set of layers being rewritten (in the split commit-graph chain) exists, and it does not contain GDAT chunk, then the result of rewrite does not have GDAT chunks either. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-18 16:21:18 -08:00
Abhishek Kumar	e8b63005c4	commit-graph: implement generation data chunk As discovered by Ævar, we cannot increment graph version to distinguish between generation numbers v1 and v2 [1]. Thus, one of pre-requistes before implementing generation number v2 was to distinguish between graph versions in a backwards compatible manner. We are going to introduce a new chunk called Generation DATa chunk (or GDAT). GDAT will store corrected committer date offsets whereas CDAT will still store topological level. Old Git does not understand GDAT chunk and would ignore it, reading topological levels from CDAT. New Git can parse GDAT and take advantage of newer generation numbers, falling back to topological levels when GDAT chunk is missing (as it would happen with a commit-graph written by old Git). We introduce a test environment variable 'GIT_TEST_COMMIT_GRAPH_NO_GDAT' which forces commit-graph file to be written without generation data chunk to emulate a commit-graph file written by old Git. To minimize the space required to store corrrected commit date, Git stores corrected commit date offsets into the commit-graph file, instea of corrected commit dates. This saves us 4 bytes per commit, decreasing the GDAT chunk size by half, but it's possible for the offset to overflow the 4-bytes allocated for storage. As such overflows are and should be exceedingly rare, we use the following overflow management scheme: We introduce a new commit-graph chunk, Generation Data OVerflow ('GDOV') to store corrected commit dates for commits with offsets greater than GENERATION_NUMBER_V2_OFFSET_MAX. If the offset is greater than GENERATION_NUMBER_V2_OFFSET_MAX, we set the MSB of the offset and the other bits store the position of corrected commit date in GDOV chunk, similar to how Extra Edge List is maintained. We test the overflow-related code with the following repo history: F - N - U / \ U - N - U N \ / N - F - N Where the commits denoted by U have committer date of zero seconds since Unix epoch, the commits denoted by N have committer date of 1112354055 (default committer date for the test suite) seconds since Unix epoch and the commits denoted by F have committer date of (2 ^ 31 - 2) seconds since Unix epoch. The largest offset observed is 2 ^ 31, just large enough to overflow. [1]: https://lore.kernel.org/git/87a7gdspo4.fsf@evledraar.gmail.com/ Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-18 16:21:18 -08:00
Abhishek Kumar	c1a09119f6	commit-graph: implement corrected commit date With most of preparations done, let's implement corrected commit date. The corrected commit date for a commit is defined as: * A commit with no parents (a root commit) has corrected commit date equal to its committer date. * A commit with at least one parent has corrected commit date equal to the maximum of its commit date and one more than the largest corrected commit date among its parents. As a special case, a root commit with timestamp of zero (01.01.1970 00:00:00Z) has corrected commit date of one, to be able to distinguish from GENERATION_NUMBER_ZERO (that is, an uncomputed corrected commit date). To minimize the space required to store corrected commit date, Git stores corrected commit date offsets into the commit-graph file. The corrected commit date offset for a commit is defined as the difference between its corrected commit date and actual commit date. Storing corrected commit date requires sizeof(timestamp_t) bytes, which in most cases is 64 bits (uintmax_t). However, corrected commit date offsets can be safely stored using only 32-bits. This halves the size of GDAT chunk, which is a reduction of around 6% in the size of commit-graph file. However, using offsets be problematic if a commit is malformed but valid and has committer date of 0 Unix time, as the offset would be the same as corrected commit date and thus require 64-bits to be stored properly. While Git does not write out offsets at this stage, Git stores the corrected commit dates in member generation of struct commit_graph_data. It will begin writing commit date offsets with the introduction of generation data chunk. Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-18 16:21:18 -08:00
Abhishek Kumar	d7f92784c6	commit-graph: return 64-bit generation number In a preparatory step for introducing corrected commit dates, let's return timestamp_t values from commit_graph_generation(), use timestamp_t for local variables and define GENERATION_NUMBER_INFINITY as (2 ^ 63 - 1) instead. We rename GENERATION_NUMBER_MAX to GENERATION_NUMBER_V1_MAX to represent the largest topological level we can store in the commit data chunk. With corrected commit dates implemented, we will have two such *_MAX variables to denote the largest offset and largest topological level that can be stored. Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-18 16:21:18 -08:00
Abhishek Kumar	72a2bfcaf0	commit-graph: add a slab to store topological levels In a later commit we will introduce corrected commit date as the generation number v2. Corrected commit dates will be stored in the new seperate Generation Data chunk. However, to ensure backwards compatibility with "Old" Git we need to continue to write generation number v1 (topological levels) to the commit data chunk. Thus, we need to compute and store both versions of generation numbers to write the commit-graph file. Therefore, let's introduce a commit-slab `topo_level_slab` to store topological levels; corrected commit date will be stored in the member `generation` of struct commit_graph_data. The macros `GENERATION_NUMBER_INFINITY` and `GENERATION_NUMBER_ZERO` mark commits not in the commit-graph file and commits written by a version of Git that did not compute generation numbers respectively. Generation numbers are computed identically for both kinds of commits. A "slab-miss" should return `GENERATION_NUMBER_INFINITY` as the commit is not in the commit-graph file. However, since the slab is zero-initialized, it returns 0 (or rather `GENERATION_NUMBER_ZERO`). Thus, we no longer need to check if the topological level of a commit is `GENERATION_NUMBER_INFINITY`. We will add a pointer to the slab in `struct write_commit_graph_context` and `struct commit_graph` to populate the slab in `fill_commit_graph_info` if the commit has a pre-computed topological level as in case of split commit-graphs. Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-18 16:21:18 -08:00
Abhishek Kumar	c0ef139843	t6600-test-reach: generalize _three_modes In a preparatory step to implement generation number v2, we add tests to ensure Git can read and parse commit-graph files without Generation Data chunk. These files represent commit-graph files written by Old Git and are neccesary for backward compatability. We extend run_three_modes() and test_three_modes() to _all_modes() with the fourth mode being "commit-graph without generation data chunk". Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-18 16:21:18 -08:00
Abhishek Kumar	f90fca638e	commit-graph: consolidate fill_commit_graph_info Both fill_commit_graph_info() and fill_commit_in_graph() parse information present in commit data chunk. Let's simplify the implementation by calling fill_commit_graph_info() within fill_commit_in_graph(). fill_commit_graph_info() used to not load committer data from commit data chunk. However, with the upcoming switch to using corrected committer date as generation number v2, we will have to load committer date to compute generation number value anyway. `e51217e15` (t5000: test tar files that overflow ustar headers, 30-06-2016) introduced a test 'generate tar with future mtime' that creates a commit with committer date of (2^36 + 1) seconds since EPOCH. The CDAT chunk provides 34-bits for storing committer date, thus committer time overflows into generation number (within CDAT chunk) and has undefined behavior. The test used to pass as fill_commit_graph_info() would not set struct member `date` of struct commit and load committer date from the object database, generating a tar file with the expected mtime. However, with corrected commit date, we will load the committer date from CDAT chunk (truncated to lower 34-bits to populate the generation number. Thus, Git sets date and generates tar file with the truncated mtime. The ustar format (the header format used by most modern tar programs) only has room for 11 (or 12, depending on some implementations) octal digits for the size and mtime of each file. As the CDAT chunk is overflow by 12-octal digits but not 11-octal digits, we split the existing tests to test both implementations separately and add a new explicit test for 11-digit implementation. To test the 11-octal digit implementation, we create a future commit with committer date of 2^34 - 1, which overflows 11-octal digits without overflowing 34-bits of the Commit Date chunks. To test the 12-octal digit implementation, the smallest committer date possible is 2^36 + 1, which overflows the CDAT chunk and thus commit-graph must be disabled for the test. Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-18 16:21:18 -08:00
Abhishek Kumar	2f9bbb6d91	revision: parse parent in indegree_walk_step() In indegree_walk_step(), we add unvisited parents to the indegree queue. However, parents are not guaranteed to be parsed. As the indegree queue sorts by generation number, let's parse parents before inserting them to ensure the correct priority order. Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-18 16:21:18 -08:00
Abhishek Kumar	e30c5ee76c	commit-graph: fix regression when computing Bloom filters Before computing Bloom filters, the commit-graph machinery uses commit_gen_cmp to sort commits by generation order for improved diff performance. `3d11275505` (commit-graph: examine commits by generation number, 2020-03-30) claims that this sort can reduce the time spent to compute Bloom filters by nearly half. But since `c49c82aa4c` (commit: move members graph_pos, generation to a slab, 2020-06-17), this optimization is broken, since asking for a 'commit_graph_generation()' directly returns GENERATION_NUMBER_INFINITY while writing. Not all hope is lost, though: 'commit_gen_cmp()' falls back to comparing commits by their date when they have equal generation number, and so since `c49c82aa4c` is purely a date comparison function. This heuristic is good enough that we don't seem to loose appreciable performance while computing Bloom filters. Applying this patch (compared with v2.30.0) speeds up computing Bloom filters by factors ranging from 0.40% to 5.19% on various repositories [1]. So, avoid the useless 'commit_graph_generation()' while writing by instead accessing the slab directly. This returns the newly-computed generation numbers, and allows us to avoid the heuristic by directly comparing generation numbers. [1]: https://lore.kernel.org/git/20210105094535.GN8396@szeder.dev/ Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-18 16:21:17 -08:00
Derrick Stolee	a4b6d202ca	cache-tree: speed up consecutive path comparisons The previous change reduced time spent in strlen() while comparing consecutive paths in verify_cache(), but we can do better. The conditional checks the existence of a directory separator at the correct location, but only after doing a string comparison. Swap the order to be logically equivalent but perform fewer string comparisons. To test the effect on performance, I used a repository with over three million paths in the index. I then ran the following command on repeat: git -c index.threads=1 commit --amend --allow-empty --no-edit Here are the measurements over 10 runs after a 5-run warmup: Benchmark #1: v2.30.0 Time (mean ± σ): 854.5 ms ± 18.2 ms Range (min … max): 825.0 ms … 892.8 ms Benchmark #2: Previous change Time (mean ± σ): 833.2 ms ± 10.3 ms Range (min … max): 815.8 ms … 849.7 ms Benchmark #3: This change Time (mean ± σ): 815.5 ms ± 18.1 ms Range (min … max): 795.4 ms … 849.5 ms This change is 2% faster than the previous change and 5% faster than v2.30.0. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-15 23:05:13 -08:00
René Scharfe	0b72536a0b	cache-tree: use ce_namelen() instead of strlen() Use the name length field of cache entries instead of calculating its value anew. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-15 23:05:13 -08:00
Derrick Stolee	4bdde337f4	index-format: discuss recursion of cache-tree better The end of the cache tree index extension format trails off with ellipses ever since `23fcc98` (doc: technical details about the index file format, 2011-03-01). While an intuitive reader could gather what this means, it could be better to use "and so on" instead. Really, this is only justified because I also wanted to point out that the number of subtrees in the index format is used to determine when the recursive depth-first-search stack should be "popped." This should help to add clarity to the format. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-15 23:04:59 -08:00
Derrick Stolee	22ad8600c1	index-format: update preamble to cache tree extension I had difficulty in my efforts to learn about the cache tree extension based on the documentation and code because I had an incorrect assumption about how it behaved. This might be due to some ambiguity in the documentation, so this change modifies the beginning of the cache tree format by expanding the description of the feature. My hope is that this documentation clarifies a few things: 1. There is an in-memory recursive tree structure that is constructed from the extension data. This structure has a few differences, such as where the name is stored. 2. What does it mean for an entry to be invalid? 3. When exactly are "new" trees created? Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-15 23:04:46 -08:00
Derrick Stolee	845d15d4d0	index-format: use 'cache tree' over 'cached tree' The index has a "cache tree" extension. This corresponds to a significant API implemented in cache-tree.[ch]. However, there are a few places that refer to this erroneously as "cached tree". These are rare, but notably the index-format.txt file itself makes this error. The only other reference is in t7104-reset-hard.sh. Reported-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-15 23:04:38 -08:00
Derrick Stolee	0e5c950267	cache-tree: trace regions for prime_cache_tree Commands such as "git reset --hard" rebuild the in-memory representation of the cache tree index extension by parsing tree objects starting at a known root tree. The performance of this operation can vary widely depending on the width and depth of the repository's working directory structure. Measure the time in this operation using trace2 regions in prime_cache_tree(). Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-15 23:04:32 -08:00
Derrick Stolee	4c3e18723c	cache-tree: trace regions for I/O As we write or read the cache tree index extension, it can be good to isolate how much of the file I/O time is spent constructing this in-memory tree from the existing index or writing it out again to the new index file. Use trace2 regions to indicate that we are spending time on this operation. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-15 23:04:21 -08:00
Junio C Hamano	66e871b664	The third batch Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-15 21:48:47 -08:00
Junio C Hamano	49656f9445	Merge branch 'jc/macos-install-dependencies-fix' Fix for procedure to building CI test environment for mac. * jc/macos-install-dependencies-fix: ci/install-depends: attempt to fix "brew cask" stuff	2021-01-15 21:48:47 -08:00
Junio C Hamano	8782bfbf01	Merge branch 'tb/local-clone-race-doc' Doc update. * tb/local-clone-race-doc: Documentation/git-clone.txt: document race with --local	2021-01-15 21:48:47 -08:00
Junio C Hamano	644d85e751	Merge branch 'bc/doc-status-short' Doc update. * bc/doc-status-short: docs: rephrase and clarify the git status --short format	2021-01-15 21:48:47 -08:00
Junio C Hamano	453e149c8a	Merge branch 'dl/p4-encode-after-kw-expansion' Text encoding fix for "git p4". * dl/p4-encode-after-kw-expansion: git-p4: fix syncing file types with pattern	2021-01-15 21:48:47 -08:00
Junio C Hamano	cf2870adda	Merge branch 'ab/gettext-charset-comment-fix' Comments update. * ab/gettext-charset-comment-fix: gettext.c: remove/reword a mostly-useless comment Makefile: remove a warning about old GETTEXT_POISON flag	2021-01-15 21:48:46 -08:00
Junio C Hamano	eecc5f0775	Merge branch 'ug/doc-lose-dircache' Doc update. * ug/doc-lose-dircache: doc: remove "directory cache" from man pages	2021-01-15 21:48:46 -08:00
Junio C Hamano	d9e1cd555d	Merge branch 'ad/t4129-setfacl-target-fix' Test fix. * ad/t4129-setfacl-target-fix: t4129: fix setfacl-related permissions failure	2021-01-15 21:48:46 -08:00
Junio C Hamano	2b8cef2307	Merge branch 'jk/t5516-deflake' Test fix. * jk/t5516-deflake: t5516: loosen "not our ref" error check	2021-01-15 21:48:46 -08:00
Junio C Hamano	788f488b33	Merge branch 'vv/send-email-with-less-secure-apps-access' Doc update. * vv/send-email-with-less-secure-apps-access: git-send-email.txt: mention less secure app access with Gmail	2021-01-15 21:48:46 -08:00
Junio C Hamano	073552d7ae	Merge branch 'pb/mergetool-tool-help-fix' Fix 2.29 regression where "git mergetool --tool-help" fails to list all the available tools. * pb/mergetool-tool-help-fix: mergetool--lib: fix '--tool-help' to correctly show available tools	2021-01-15 21:48:46 -08:00
Junio C Hamano	aa08688362	Merge branch 'ds/for-each-repo-noopfix' "git for-each-repo --config=<var> <cmd>" should not run <cmd> for any repository when the configuration variable <var> is not defined even once. * ds/for-each-repo-noopfix: for-each-repo: do nothing on empty config	2021-01-15 21:48:46 -08:00
Junio C Hamano	6a393f36d9	Merge branch 'jc/sign-off' Doc update. * jc/sign-off: SubmittingPatches: tighten wording on "sign-off" procedure	2021-01-15 21:48:45 -08:00
Junio C Hamano	8dbabb31df	Merge branch 'mt/t4129-with-setgid-dir' Some tests expect that "ls -l" output has either '-' or 'x' for group executable bit, but setgid bit can be inherited from parent directory and make these fields 'S' or 's' instead, causing test failures. * mt/t4129-with-setgid-dir: t4129: don't fail if setgid is set in the test directory	2021-01-15 21:48:45 -08:00
Junio C Hamano	b2ace18759	Merge branch 'ds/maintenance-part-4' Follow-up on the "maintenance part-3" which introduced scheduled maintenance tasks to support platforms whose native scheduling methods are not 'cron'. * ds/maintenance-part-4: maintenance: use Windows scheduled tasks maintenance: use launchctl on macOS maintenance: include 'cron' details in docs maintenance: extract platform-specific scheduling	2021-01-15 21:48:45 -08:00
Junio C Hamano	4151fdb1c7	The second batch Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-15 15:20:30 -08:00
Junio C Hamano	f9fb9063fd	Merge branch 'fc/completion-aliases-support' Bash completion (in contrib/) update to make it easier for end-users to add completion for their custom "git" subcommands. * fc/completion-aliases-support: completion: add proper public __git_complete test: completion: add tests for __git_complete completion: bash: improve function detection completion: bash: add __git_have_func helper	2021-01-15 15:20:30 -08:00
Junio C Hamano	62fb47a4d3	Merge branch 'en/stash-apply-sparse-checkout' "git stash" did not work well in a sparsely checked out working tree. * en/stash-apply-sparse-checkout: stash: fix stash application in sparse-checkouts stash: remove unnecessary process forking t7012: add a testcase demonstrating stash apply bugs in sparse checkouts	2021-01-15 15:20:29 -08:00
Junio C Hamano	1ee70a916d	Merge branch 'ar/t6016-modernise' Test update. * ar/t6016-modernise: t6016: move to lib-log-graph.sh framework	2021-01-15 15:20:29 -08:00
Junio C Hamano	2ce8de6bf9	Merge branch 'zh/arg-help-format' Clean up option descriptions in "git cmd --help". * zh/arg-help-format: builtin/*: update usage format parse-options: format argh like error messages	2021-01-15 15:20:29 -08:00
Junio C Hamano	7bfa022993	Merge branch 'nk/perf-fsmonitor-cleanup' Test fix. * nk/perf-fsmonitor-cleanup: p7519: allow running without watchman prereq	2021-01-15 15:20:29 -08:00
Junio C Hamano	02feca721e	Merge branch 'ds/trace2-topo-walk' The topological walk codepath is covered by new trace2 stats. * ds/trace2-topo-walk: revision: trace topo-walk statistics	2021-01-15 15:20:29 -08:00
Junio C Hamano	df26861c56	Merge branch 'rs/rebase-commit-validation' Diagnose command line error of "git rebase" early. * rs/rebase-commit-validation: rebase: verify commit parameter	2021-01-15 15:20:29 -08:00
Junio C Hamano	8b327f1784	Merge branch 'ma/sha1-is-a-hash' Retire more names with "sha1" in it. * ma/sha1-is-a-hash: hash-lookup: rename from sha1-lookup sha1-lookup: rename `sha1_pos()` as `hash_pos()` object-file.c: rename from sha1-file.c object-name.c: rename from sha1-name.c	2021-01-15 15:20:29 -08:00
Junio C Hamano	16a8055dae	Merge branch 'ma/doc-pack-format-varint-for-sizes' Doc update. * ma/doc-pack-format-varint-for-sizes: pack-format.txt: document sizes at start of delta data	2021-01-15 15:20:29 -08:00
Junio C Hamano	a11571bb7f	Merge branch 'ma/t1300-cleanup' Code clean-up. * ma/t1300-cleanup: t1300: don't needlessly work with `core.foo` configs t1300: remove duplicate test for `--file no-such-file` t1300: remove duplicate test for `--file ../foo`	2021-01-15 15:20:28 -08:00
Junio C Hamano	40876260ef	Merge branch 'pb/doc-modules-git-work-tree-typofix' Doc fix. * pb/doc-modules-git-work-tree-typofix: gitmodules.txt: fix 'GIT_WORK_TREE' variable name	2021-01-15 15:20:28 -08:00
Junio C Hamano	b17eb5b4e4	Merge branch 'ta/doc-typofix' Doc fix. * ta/doc-typofix: doc: fix some typos	2021-01-15 15:20:28 -08:00
Junio C Hamano	9ba366f12b	Merge branch 'bc/rev-parse-path-format' "git rev-parse" can be explicitly told to give output as absolute or relative path with the `--path-format=(absolute\|relative)` option. * bc/rev-parse-path-format: rev-parse: add option for absolute or relative path formatting abspath: add a function to resolve paths with missing components	2021-01-15 15:20:28 -08:00
Junio C Hamano	6dbbae17d9	Merge branch 'ew/decline-core-abbrev' The configuration variable 'core.abbrev' can be set to 'no' to force no abbreviation regardless of the hash algorithm. * ew/decline-core-abbrev: core.abbrev=no disables abbreviations	2021-01-15 15:20:28 -08:00
Patrick Steinhardt	d8d77153ea	config: allow specifying config entries via envvar pairs While we currently have the `GIT_CONFIG_PARAMETERS` environment variable which can be used to pass runtime configuration data to git processes, it's an internal implementation detail and not supposed to be used by end users. Next to being for internal use only, this way of passing config entries has a major downside: the config keys need to be parsed as they contain both key and value in a single variable. As such, it is left to the user to escape any potentially harmful characters in the value, which is quite hard to do if values are controlled by a third party. This commit thus adds a new way of adding config entries via the environment which gets rid of this shortcoming. If the user passes the `GIT_CONFIG_COUNT=$n` environment variable, Git will parse environment variable pairs `GIT_CONFIG_KEY_$i` and `GIT_CONFIG_VALUE_$i` for each `i` in `[0,n)`. While the same can be achieved with `git -c <name>=<value>`, one may wish to not do so for potentially sensitive information. E.g. if one wants to set `http.extraHeader` to contain an authentication token, doing so via `-c` would trivially leak those credentials via e.g. ps(1), which typically also shows command arguments. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-15 13:03:45 -08:00
Patrick Steinhardt	b9d147fb15	environment: make `getenv_safe()` a public function The `getenv_safe()` helper function helps to safely retrieve multiple environment values without the need to depend on platform-specific behaviour for the return value's lifetime. We'll make use of this function in a following patch, so let's make it available by making it non-static and adding a declaration. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-15 13:03:45 -08:00
Patrick Steinhardt	1ff21c05ba	config: store "git -c" variables using more robust format The previous commit added a new format for $GIT_CONFIG_PARAMETERS which is able to robustly handle subsections with "=" in them. Let's start writing the new format. Unfortunately, this does much less than you'd hope, because "git -c" itself has the same ambiguity problem! But it's still worth doing: - we've now pushed the problem from the inter-process communication into the "-c" command-line parser. This would free us up to later add an unambiguous format there (e.g., separate arguments like "git --config key value", etc). - for --config-env, the parser already disallows "=" in the environment variable name. So: git --config-env section.with=equals.key=ENVVAR will robustly set section.with=equals.key to the contents of $ENVVAR. The new test shows the improvement for --config-env. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-15 13:03:18 -08:00
Jeff King	f9dbb64fad	config: parse more robust format in GIT_CONFIG_PARAMETERS When we stuff config options into GIT_CONFIG_PARAMETERS, we shell-quote each one as a single unit, like: 'section.one=value1' 'section.two=value2' On the reading side, we de-quote to get the individual strings, and then parse them by splitting on the first "=" we find. This format is ambiguous, because an "=" may appear in a subsection. So the config represented in a file by both: [section "subsection=with=equals"] key = value and: [section] subsection = with=equals.key=value ends up in this flattened format like: 'section.subsection=with=equals.key=value' and we can't tell which was desired. We have traditionally resolved this by taking the first "=" we see starting from the left, meaning that we allowed arbitrary content in the value, but not in the subsection. Let's make our environment format a bit more robust by separately quoting the key and value. That turns those examples into: 'section.subsection=with=equals.key'='value' and: 'section.subsection'='with=equals.key=value' respectively, and we can tell the difference between them. We can detect which format is in use for any given element of the list based on the presence of the unquoted "=". That means we can continue to allow the old format to work to support any callers which manually used the old format, and we can even intermingle the two formats. The old format wasn't documented, and nobody was supposed to be using it. But it's likely that such callers exist in the wild, so it's nice if we can avoid breaking them. Likewise, it may be possible to trigger an older version of "git -c" that runs a script that calls into a newer version of "git -c"; that new version would see the intermingled format. This does create one complication, which is that the obvious format in the new scheme for [section] some-bool is: 'section.some-bool' with no equals. We'd mistake that for an old-style variable. And it even has the same meaning in the old style, but: [section "with=equals"] some-bool does not. It would be: 'section.with=equals=some-bool' which we'd take to mean: [section] with = equals=some-bool in the old, ambiguous style. Likewise, we can't use: 'section.some-bool'='' because that's ambiguous with an actual empty string. Instead, we'll again use the shell-quoting to give us a hint, and use: 'section.some-bool'= to show that we have no value. Note that this commit just expands the reading side. We'll start writing the new format via "git -c" in a future patch. In the meantime, the existing "git -c" tests will make sure we didn't break reading the old format. But we'll also add some explicit coverage of the two formats to make sure we continue to handle the old one after we move the writing side over. And one final note: since we're now using the shell-quoting as a semantically meaningful hint, this closes the door to us ever allowing arbitrary shell quoting, like: 'a'shell'would'be'ok'with'this'.key=value But we have never supported that (only what sq_quote() would produce), and we are probably better off keeping things simple, robust, and backwards-compatible, than trying to make it easier for humans. We'll continue not to advertise the format of the variable to users, and instead keep "git -c" as the recommended mechanism for setting config (even if we are trying to be kind not to break users who may be relying on the current undocumented format). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-15 13:03:18 -08:00
Junio C Hamano	2d02bc91c0	t4203: make blame output massaging more robust In the "git blame --porcelain" output, lines that ends with three integers may not be the line that shows a commit object with line numbers and block length (the contents from the blamed file or the summary field can have a line that happens to match). Also, the names of the author may have more than three SP separated tokens ("git blame -L242,+1 `cf6de18aab` Documentation/SubmittingPatches" gives an example). The existing "grep -E \| cut" pipeline is a bit too loose on these two points. While they can be assumed on the test data, it is not so hard to use the right pattern from the documented format, so let's do so. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-14 21:54:52 -08:00
Philippe Blain	97f4b4c4e7	mailmap doc: use correct environment variable 'GIT_WORK_TREE' gitmailmap(5) uses 'GIT_WORK_DIR' to refer to the root of the repository, but this environment variable does not exist. Use the correct spelling for that variable, 'GIT_WORK_TREE'. Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-14 21:54:06 -08:00
Jeff King	779412b9d9	for_each_object_in_pack(): clarify pack vs index ordering We may return objects in one of two orders: how they appear in the .idx (sorted by object id) or how they appear in the packfile itself. To further complicate matters, we have two ordering variables, "i" and "pos", and it is not clear to which order they apply. Let's clarify this by using an unambiguous name where possible, and leaving a comment for the variable that does double-duty. Signed-off-by: Jeff King <peff@peff.net> Acked-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-14 18:22:27 -08:00
Denton Liu	afa80f534b	t4203: stop losing return codes of git commands In a pipe, only the return code of the last command is used. Thus, all other commands will have their return codes masked. Rewrite pipes so that there are no git commands upstream so that their failure is reported. Signed-off-by: Denton Liu <liu.denton@gmail.com> Acked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-14 18:21:21 -08:00
Denton Liu	f9f30a0310	test-lib-functions.sh: fix usage for test_commit() The usage comment for test_commit() shows that the --author option should be given as `--author=<author>`. However, this is incorrect as it only works when given as `--author <author>`. Correct this erroneous text. Also, for the sake of correctness, fix the description as well since we invoke `git commit` with `--author <author>`, not `--author=<author>`. Signed-off-by: Denton Liu <liu.denton@gmail.com> Acked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-14 18:21:03 -08:00
Christian Couder	7c99bc23fc	pack-write: die on error in write_promisor_file() write_promisor_file() already uses xfopen(), so it would die if the file cannot be opened for writing. To be consistent with this behavior and not overlook issues, let's also die if there are errors when we are actually writing to the file. Suggested-by: Jeff King <peff@peff.net> Suggested-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-14 17:02:22 -08:00
Junio C Hamano	12aa5552a9	Merge branch 'en/ort-conflict-handling' into en/merge-ort-perf * en/ort-conflict-handling: merge-ort: add handling for different types of files at same path merge-ort: copy find_first_merges() implementation from merge-recursive.c merge-ort: implement format_commit() merge-ort: copy and adapt merge_submodule() from merge-recursive.c merge-ort: copy and adapt merge_3way() from merge-recursive.c merge-ort: flesh out implementation of handle_content_merge() merge-ort: handle book-keeping around two- and three-way content merge merge-ort: implement unique_path() helper merge-ort: handle directory/file conflicts that remain merge-ort: handle D/F conflict where directory disappears due to merge	2021-01-14 12:41:54 -08:00
Junio C Hamano	cafc587a1d	Merge branch 'en/diffcore-rename' into en/merge-ort-perf * en/diffcore-rename: diffcore-rename: remove unnecessary duplicate entry checks diffcore-rename: accelerate rename_dst setup diffcore-rename: simplify and accelerate register_rename_src() t4058: explore duplicate tree entry handling in a bit more detail t4058: add more tests and documentation for duplicate tree entry handling diffcore-rename: reduce jumpiness in progress counters diffcore-rename: simplify limit check diffcore-rename: avoid usage of global in too_many_rename_candidates() diffcore-rename: rename num_create to num_destinations	2021-01-14 12:41:45 -08:00
Taylor Blau	e5dcd78418	pack-revindex.c: avoid direct revindex access in 'offset_to_pack_pos()' To prepare for on-disk reverse indexes, remove a spot in 'offset_to_pack_pos()' that looks at the 'revindex' array in 'struct packed_git'. Even though this use of the revindex pointer is within pack-revindex.c, this clean up is still worth doing. Since the 'revindex' pointer will be NULL when reading from an on-disk reverse index (instead the 'revindex_data' pointer will be mmaped to the 'pack-*.rev' file), this call-site would have to include a conditional to lookup the offset for position 'mi' each iteration through the search. So instead of open-coding 'pack_pos_to_offset()', call it directly from within 'offset_to_pack_pos()'. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-13 21:53:48 -08:00
Taylor Blau	d5bc7c60c7	pack-revindex: hide the definition of 'revindex_entry' Now that all spots outside of pack-revindex.c that reference 'struct revindex_entry' directly have been removed, it is safe to hide the implementation by moving it from pack-revindex.h to pack-revindex.c. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-13 21:53:48 -08:00
Taylor Blau	8389855a9b	pack-revindex: remove unused 'find_revindex_position()' Now that all 'find_revindex_position()' callers have been removed (and converted to the more descriptive 'offset_to_pack_pos()'), it is almost safe to get rid of 'find_revindex_position()' entirely. Almost, except for the fact that 'offset_to_pack_pos()' calls 'find_revindex_position()'. Inline 'find_revindex_position()' into 'offset_to_pack_pos()', and then remove 'find_revindex_position()' entirely. This is a straightforward refactoring with one minor snag. 'offset_to_pack_pos()' used to load the index before calling 'find_revindex_position()'. That means that by the time 'find_revindex_position()' starts executing, 'p->num_objects' can be safely read. After inlining, be careful to not read 'p->num_objects' until _after_ 'load_pack_revindex()' (which loads the index as a side-effect) has been called. Another small fix that is included is converting the upper- and lower-bounds to be unsigned's instead of ints. This dates back to `92e5c77c37` (revindex: export new APIs, 2013-10-24)--ironically, the last time we introduced new APIs here--but this unifies the types. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-13 21:53:48 -08:00
Taylor Blau	1c3855f33b	pack-revindex: remove unused 'find_pack_revindex()' Now that no callers of 'find_pack_revindex()' remain, remove the function's declaration and implementation entirely. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-13 21:53:47 -08:00
Taylor Blau	2891b434ac	builtin/gc.c: guess the size of the revindex 'estimate_repack_memory()' takes into account the amount of memory required to load the reverse index in memory by multiplying the assumed number of objects by the size of the 'revindex_entry' struct. Prepare for hiding the definition of 'struct revindex_entry' by removing a 'sizeof()' of that type from outside of pack-revindex.c. Instead, guess that one off_t and one uint32_t are required per object. Strictly speaking, this is a worse guess than asking for 'sizeof(struct revindex_entry)' directly, since the true size of this struct is 16 bytes with padding on the end of the struct in order to align the offset field. But, this is an approximation anyway, and it does remove a use of the 'struct revindex_entry' from outside of pack-revindex internals. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-13 21:53:47 -08:00
Taylor Blau	b130aef65e	for_each_object_in_pack(): convert to new revindex API Avoid looking at the 'revindex' pointer directly and instead call 'pack_pos_to_index()'. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-13 21:53:47 -08:00
Taylor Blau	0a7e3642bc	unpack_entry(): convert to new revindex API Remove direct manipulation of the 'struct revindex_entry' type as well as calls to the deprecated API in 'packfile.c:unpack_entry()'. Usual clean-up is performed (replacing '->nr' with calls to 'pack_pos_to_index()' and so on). Add an additional check to make sure that 'obj_offset()' points at a valid object. In the case this check is violated, we cannot call 'mark_bad_packed_object()' because we don't know the OID. At the top of the call stack is do_oid_object_info_extended() (via packed_object_info()), which does mark the object. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-13 21:53:47 -08:00
Taylor Blau	fc150caf67	packed_object_info(): convert to new revindex API Convert another call of 'find_pack_revindex()' to its replacement 'pack_pos_to_offset()'. Likewise: - Avoid manipulating `struct packed_git`'s `revindex` pointer directly by removing the pointer-as-array indexing. - Add an additional guard to check that the offset 'obj_offset()' points to a real object. This should be the case with well-behaved callers to 'packed_object_info()', but isn't guarenteed. Other blocks that fill in various other values from the 'struct object_info' request handle bad inputs by setting the type to 'OBJ_BAD' and jumping to 'out'. Do the same when given a bad offset here. The previous code would have segfaulted when given a bad 'obj_offset' value, since 'find_pack_revindex()' would return 'NULL', and then the line that fills 'oi->disk_sizep' would try to access 'NULL[1]' with a stride of 16 bytes (the width of 'struct revindex_entry)'. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-13 21:53:47 -08:00
Taylor Blau	3a3f54dd0a	retry_bad_packed_offset(): convert to new revindex API Perform exactly the same conversion as in the previous commit to another caller within 'packfile.c'. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-13 21:53:47 -08:00
Taylor Blau	45bef5c064	get_delta_base_oid(): convert to new revindex API Replace direct accesses to the 'struct revindex' type with a call to 'pack_pos_to_index()'. Likewise drop the old-style 'find_pack_revindex()' with its replacement 'offset_to_pack_pos()' (while continuing to perform the same error checking). Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-13 21:53:46 -08:00
Taylor Blau	78232bf65d	rebuild_existing_bitmaps(): convert to new revindex API Remove another instance of looking at the revindex directly by instead calling 'pack_pos_to_index()'. Unlike other patches, this caller only cares about the index position of each object in the loop. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-13 21:53:46 -08:00
Taylor Blau	011f3fd5cd	try_partial_reuse(): convert to new revindex API Remove another instance of direct revindex manipulation by calling 'pack_pos_to_offset()' instead (the caller here does not care about the index position of the object at position 'pos'). Note that we cannot just use the existing "offset" variable to store the value we get from pack_pos_to_offset(). It is incremented by unpack_object_header(), but we later need the original value. Since we'll no longer have revindex->offset to read it from, we'll store that in a separate variable ("header" since it points to the entry's header bytes). Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-13 21:53:46 -08:00
Taylor Blau	a78a90324d	get_size_by_pos(): convert to new revindex API Remove another caller that holds onto a 'struct revindex_entry' by replacing the direct indexing with calls to 'pack_pos_to_offset()' and 'pack_pos_to_index()'. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-13 21:53:46 -08:00
Taylor Blau	cf98f2e8e0	show_objects_for_type(): convert to new revindex API Avoid storing the revindex entry directly, since this structure will soon be removed from the public interface. Instead, store the offset and index position by calling 'pack_pos_to_offset()' and 'pack_pos_to_index()', respectively. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-13 21:53:46 -08:00
Taylor Blau	57665086af	bitmap_position_packfile(): convert to new revindex API Replace find_revindex_position() with its counterpart in the new API, offset_to_pack_pos(). Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-13 21:53:45 -08:00
Taylor Blau	eb3fd99efd	check_object(): convert to new revindex API Replace direct accesses to the revindex with calls to 'offset_to_pack_pos()' and 'pack_pos_to_index()'. Since this caller already had some error checking (it can jump to the 'give_up' label if it encounters an error), we can easily check whether or not the provided offset points to an object in the given pack. This error checking existed prior to this patch, too, since the caller checks whether the return value from 'find_pack_revindex()' was NULL or not. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-13 21:53:45 -08:00
Taylor Blau	6a5c10c45f	write_reused_pack_verbatim(): convert to new revindex API Replace a direct access to the revindex array with 'pack_pos_to_offset()'. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-13 21:53:45 -08:00
Taylor Blau	66cbd3e2fb	write_reused_pack_one(): convert to new revindex API Replace direct revindex accesses with calls to 'pack_pos_to_offset()' and 'pack_pos_to_index()'. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-13 21:53:45 -08:00
Taylor Blau	952fc6870d	write_reuse_object(): convert to new revindex API First replace 'find_pack_revindex()' with its replacement 'offset_to_pack_pos()'. This prevents any bogus OFS_DELTA that may make its way through until 'write_reuse_object()' from causing a bad memory read (if 'revidx' is 'NULL') Next, replace a direct access of '->nr' with the wrapper function 'pack_pos_to_index()'. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-13 21:53:45 -08:00
Taylor Blau	f33fb6e419	pack-revindex: introduce a new API In the next several patches, we will prepare for loading a reverse index either in memory (mapping the inverse of the .idx's contents in-core), or directly from a yet-to-be-introduced on-disk format. To prepare for that, we'll introduce an API that avoids the caller explicitly indexing the revindex pointer in the packed_git structure. There are four ways to interact with the reverse index. Accordingly, four functions will be exported from 'pack-revindex.h' by the time that the existing API is removed. A caller may: 1. Load the pack's reverse index. This involves opening up the index, generating an array, and then sorting it. Since opening the index can fail, this function ('load_pack_revindex()') returns an int. Accordingly, it takes only a single argument: the 'struct packed_git' the caller wants to build a reverse index for. This function is well-suited for both the current and new API. Callers will have to continue to open the reverse index explicitly, but this function will eventually learn how to detect and load a reverse index from the on-disk format, if one exists. Otherwise, it will fallback to generating one in memory from scratch. 2. Convert a pack position into an offset. This operation is now called `pack_pos_to_offset()`. It takes a pack and a position, and returns the corresponding off_t. Any error simply calls BUG(), since the callers are not well-suited to handle a failure and keep going. 3. Convert a pack position into an index position. Same as above; this takes a pack and a position, and returns a uint32_t. This operation is known as `pack_pos_to_index()`. The same thinking about error conditions applies here as well. 4. Find the pack position for a given offset. This operation is now known as `offset_to_pack_pos()`. It takes a pack, an offset, and a pointer to a uint32_t where the position is written, if an object exists at that offset. Otherwise, -1 is returned to indicate failure. Unlike some of the callers that used to access '->offset' and '->nr' directly, the error checking around this call is somewhat more robust. This is important since callers should always pass an offset which points at the boundary of two objects. The API, unlike direct access, enforces that that is the case. This will become important in a subsequent patch where a caller which does not but could check the return value treats the signed `-1` from `find_revindex_position()` as an index into the 'revindex' array. Two design warts are carried over into the new API: - Asking for the index position of an out-of-bounds object will result in a BUG() (since no such object exists), but asking for the offset of the non-existent object at the end of the pack returns the total size of the pack. This makes it convenient for callers who always want to take the difference of two adjacent object's offsets (to compute the on-disk size) but don't want to worry about boundaries at the end of the pack. - offset_to_pack_pos() lazily loads the reverse index, but pack_pos_to_index() doesn't (callers of the former are well-suited to handle errors, but callers of the latter are not). Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-13 21:53:44 -08:00
Ævar Arnfjörð Bjarmason	0d28d3cf33	CoC: update to version 2.0 + local changes Update the CoC added in `5cdf2301` (add a Code of Conduct document, 2019-09-24 from version 1.4 to version 2.0. This is the version found at [1] with the following minor changes: - We preserve the change to the CoC in `3f9ef874a7` (CODE_OF_CONDUCT: mention individual project-leader emails, 2019-09-26) - We preserve the custom intro added in `5cdf2301d4` (add a Code of Conduct document, 2019-09-24) This change intentionally preserves a warning emitted on "git diff --check". It's better to make it easily diff-able with upstream than to fix whitespace changes in our version while we're at it. 1. https://www.contributor-covenant.org/version/2/0/code_of_conduct/code_of_conduct.md Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Acked-by: Christian Couder <chriscool@tuxfamily.org> Acked-by: Derrick Stolee <dstolee@microsoft.com> Acked-by: Elijah Newren <newren@gmail.com> Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de> Acked-by: Jonathan Nieder <jrnieder@gmail.com> Acked-by: brian m. carlson <sandals@crustytoothpaste.net> Acked-by: Jeff King <peff@peff.net> Acked-by: Taylor Blau <me@ttaylor.com> Acked-by: Jonathan Tan <jonathantanmy@google.com> Acked-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-13 17:45:04 -08:00
Christian Couder	33add2ad7d	fetch-pack: refactor writing promisor file Let's replace the 2 different pieces of code that write a promisor file in 'builtin/repack.c' and 'fetch-pack.c' with a new function called 'write_promisor_file()' in 'pack-write.c' and 'pack.h'. This might also help us in the future, if we want to put back the ref names and associated hashes that were in the promisor files we are repacking in 'builtin/repack.c' as suggested by a NEEDSWORK comment just above the code we are refactoring. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-12 16:01:07 -08:00
Christian Couder	9d7fa3be31	fetch-pack: rename helper to create_promisor_file() As we are going to refactor the code that actually writes the promisor file into a separate function in a following commit, let's rename the current write_promisor_file() function to create_promisor_file(). Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-12 16:01:07 -08:00
Ævar Arnfjörð Bjarmason	4e168333a8	shortlog: remove unused(?) "repo-abbrev" feature Remove support for the magical "repo-abbrev" comment in .mailmap files. This was added to .mailmap parsing in [1], as a generalized feature of the git-shortlog Perl script added earlier in [2]. There was no documentation or tests for this feature, and I don't think it's used in practice anymore. What it did was to allow you to specify a single string to be search-replaced with "/.../" in the .mailmap file. E.g. for linux.git's current .mailmap: git archive --remote=git@gitlab.com:linux-kernel/linux.git \ HEAD -- .mailmap \| grep -a repo-abbrev # repo-abbrev: /pub/scm/linux/kernel/git/ Then when running e.g.: git shortlog --merges --author=Linus -1 v5.10-rc7..v5.10 \| grep Merge We'd emit (the [...] is mine): Merge tag [...]git://git.kernel.org/.../tip/tip But will now emit: Merge tag [...]git.kernel.org/pub/scm/linux/kernel/git/tip/tip I think at this point this is just a historical artifact we can get rid of. It was initially meant for Linus's own use when we integrated the Perl script[2], but since then it seems he's stopped using it. Digging through Linus's release announcements on the LKML[3] the last release I can find that made use of this output is Linux 2.6.25-rc6 back in March 2008[4]. Later on Linus started using --no-merges[5], and nowadays seems to prefer some custom not-quite-shortlog format of merges from lieutenants[6]. You will still see it on linux.git if you run "git shortlog" manually yourself with --merges, with this removed you can still get the same output with: git log --pretty=fuller v5.10-rc7..v5.10 \| sed 's!/pub/scm/linux/kernel/git/!/.../!g' \| git shortlog Arguably we should do the same for the search-replacing of "[PATCH]" at the beginning with "". That seems to be another relic of a bygone era when linux.git patches would have their E-Mail subject lines applied as-is by "git am" or whatever. But we documented that feature in "git-shortlog(1)", and it seems more widely applicable than something purely kernel-specific. 1. `7595e2ee6e` (git-shortlog: make common repository prefix configurable with .mailmap, 2006-11-25) 2. `fa375c7f1b` (Add git-shortlog perl script, 2005-06-04) 3. https://lore.kernel.org/lkml/ 4. https://lore.kernel.org/lkml/alpine.LFD.1.00.0803161651350.3020@woody.linux-foundation.org/ 5. https://lore.kernel.org/lkml/BANLkTinrbh7Xi27an3uY7pDWrNKhJRYmEA@mail.gmail.com/ 6. https://lore.kernel.org/lkml/CAHk-=wg1+kf1AVzXA-RQX0zjM6t9J2Kay9xyuNqcFHWV-y5ZYw@mail.gmail.com/ Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-12 14:04:42 -08:00
Ævar Arnfjörð Bjarmason	238803cb40	mailmap doc + tests: document and test for case-insensitivity Add documentation and more tests for case-insensitivity. The existing test only matched on the E-Mail part, but as shown here we also match the name with strcasecmp(). This behavior was last discussed on the mailing list in the thread starting at [1]. It seems we're keeping it like this, so let's document it. 1. https://lore.kernel.org/git/87czykvg19.fsf@evledraar.gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-12 14:04:42 -08:00
Ævar Arnfjörð Bjarmason	34986b773a	mailmap tests: add tests for empty "<>" syntax Add tests for mailmap's handling of "<>", which is allowed on the RHS, but not the LHS of a "<LHS> <RHS>" pair. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-12 14:04:42 -08:00
Ævar Arnfjörð Bjarmason	9e2a14a889	mailmap tests: add tests for whitespace syntax Add tests for mailmap's handling of whitespace, i.e. how it trims space within "<>" and around author names. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-12 14:04:42 -08:00
Ævar Arnfjörð Bjarmason	9b391b09a0	mailmap tests: add a test for comment syntax Add a test for mailmap comment syntax. As noted in [1] there was no test coverage for this. Let's make sure a future change doesn't break it. 1. https://lore.kernel.org/git/CAN0heSoKYWXqskCR=GPreSHc6twCSo1345WTmiPdrR57XSShhA@mail.gmail.com/ Reported-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-12 14:04:42 -08:00
Ævar Arnfjörð Bjarmason	05b5ff219c	mailmap doc + tests: add better examples & test them Change the mailmap documentation added in `0925ce4d49` (Add map_user() and clear_mailmap() to mailmap, 2009-02-08) to continue discussing the Jane/Joe example. I think this makes things a lot less confusing as we're building up more complex examples using one set of data which covers all the things we'd like to discuss. Also add tests to assert that what our documentation says is what's actually happening. This is mostly (or entirely) covered by existing tests which I'm not deleting, but having these tests for the synopsis makes it easier to follow-along while reading the tests & docs. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-12 14:04:42 -08:00
Ævar Arnfjörð Bjarmason	f5d79bf7dd	tests: refactor a few tests to use "test_commit --append" Refactor a few more tests to use the new "--append" option to "test_commit". I added it for use in the mailmap tests, but this demonstrates how useful it is in general. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-12 14:04:41 -08:00
Ævar Arnfjörð Bjarmason	3373518cc8	test-lib functions: add an --append option to test_commit Add an --append option to test_commit to append <contents> to the <file> we're writing to. This simplifies a lot of test setup, as shown in some of the tests being changed here. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-12 14:04:41 -08:00
Ævar Arnfjörð Bjarmason	999cfc4f45	test-lib functions: add --author support to test_commit Add support for --author to "test_commit". This will simplify some current and future tests, one of those is being changed here. Let's also line-wrap the "git commit" command invocation to make diffs that add subsequent options easier to add, as they'll only need to add a new option line. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-12 14:04:41 -08:00
Ævar Arnfjörð Bjarmason	76b8b8d05c	test-lib functions: document arguments to test_commit The --notick argument was added in [1] and was followed by --signoff in [2], but neither of these commits added any documentation for these options. When -C was added in [3] a comment was added to document it, but not the other options. Let's document all of these options. 1. `44b85e89d7` (t7003: add test to filter a branch with a commit at epoch, 2012-07-12), 2. `5ed75e2a3f` (cherry-pick: don't forget -s on failure, 2012-09-14). 3. `6f94351b0a` (test-lib-functions.sh: teach test_commit -C <dir>, 2016-12-08) Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-12 14:04:41 -08:00
Ævar Arnfjörð Bjarmason	f21426e189	test-lib functions: expand "test_commit" comment template Expand the comment template for "test_commit" to match that of "test_commit_bulk" added in `b1c36cb849` (test-lib: introduce test_commit_bulk, 2019-07-02). It has several undocumented options, which won't all fit on one line. Follow-up commit(s) will document them. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-12 14:04:41 -08:00
Ævar Arnfjörð Bjarmason	56ac194e1d	mailmap: test for silent exiting on missing file/blob That we silently ignore missing mailmap.file or mailmap.blob values is intentional. See `938a60d64f` (mailmap: clean up read_mailmap error handling, 2012-12-12). However, nothing tested for this. Let's do that by checking that stderr is empty in those cases. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-12 14:04:41 -08:00
Ævar Arnfjörð Bjarmason	c1fe7fd7e3	mailmap tests: get rid of overly complex blame fuzzing Change a test that used a custom fuzzing function since `bfdfa3d414` (t4203 (mailmap): stop hardcoding commit ids and dates, 2010-10-15) to just use the "blame --porcelain" output instead. We could use the same pattern as `0ba9c9a0fb` (t8008: rely on rev-parse'd HEAD instead of sha1 value, 2017-07-26) does to do this, but there wouldn't be any point. We're not trying to test "blame" output here in general, just that "blame" pays attention to the mailmap. So it's sufficient to get the blamed line(s) and authors from the output, which is much easier with the "--porcelain" option. It would still be possible for there to be a bug in "blame" such that it uses the mailmap for its "--porcelain" output, but not the regular output. Let's test for that simply by checking if specifying the mailmap changes the output. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-12 14:04:41 -08:00
Ævar Arnfjörð Bjarmason	400d160e39	mailmap tests: add a test for "not a blob" error Add a test for one of the error conditions added in `938a60d64f` (mailmap: clean up read_mailmap error handling, 2012-12-12). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-12 14:04:40 -08:00
Ævar Arnfjörð Bjarmason	fb3bbe4ea3	mailmap tests: remove redundant entry in test Remove a redundant line in a test added in `d20d654fe8` (Change current mailmap usage to do matching on both name and email of author/committer., 2009-02-08). This didn't conceivably test anything useful and is most likely a copy/paste error. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-12 14:04:40 -08:00
Ævar Arnfjörð Bjarmason	1db421ab85	mailmap tests: improve --stdin tests The --stdin tests setup the "contact" file in the main setup, let's instead set it up in the test that uses it. Also refactor the first test so it's obvious that the point of it is that "check-mailmap" will spew its input as-is when given no argument. For that one we can just use the "expect" file as-is. Also add tests for how other "--stdin" cases are handled, e.g. one where we actually do a mapping. For the rest of --stdin testing we just assume we're going to get the same output. We could follow-up and make sure everything's round-tripped through both --stdin and the file/blob backends, but I don't think there's much point in that. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-12 14:04:40 -08:00
Ævar Arnfjörð Bjarmason	e9931ace4f	mailmap tests: modernize syntax & test idioms Refactor the mailmap tests to: * Setup "actual" test files in the body of "test_expect_success" * Don't have X of "test_expect_success X Y" be an unquoted string. * Not to carry over test config between tests, and instead use "test_config". * Replace various "echo" a line-at-a-time patterns with here-docs. * Change a case of "log.mailmap=False" to use the lower-case "false". Both work, but this ends up in git-config's boolean parsing and these atypical values are tested for elsewhere. Let's use the lower-case to not draw the reader's attention to this abnormality. * Remove commentary asserting that things work a given way in favor of simply testing for it, i.e. in the case of a .mailmap file outside of the repository. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-12 14:04:40 -08:00
Ævar Arnfjörð Bjarmason	9aaeac9cf7	mailmap tests: use our preferred whitespace syntax Change these tests to use the preferred whitespace around ">", "<<-EOF" etc. This is an initial step in larger and more meaningful refactoring of the file, which makes a subsequent commit easier to read. I'm not changing the whitespace of "echo <str> > file" patterns to "echo <str> >file" because all of those will be changed to here-docs in a subsequent commit. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-12 14:04:40 -08:00
Ævar Arnfjörð Bjarmason	fcafb75382	mailmap doc: start by mentioning the comment syntax Mentioning the comment syntax and blank line support first is in line with how "git help config" describes its format. See `b8936cf060` (config.txt grammar, typo, and asciidoc fixes, 2006-06-08) for the paragraph I'm copying & amending from its documentation. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-12 14:04:40 -08:00
Ævar Arnfjörð Bjarmason	6646cca892	check-mailmap doc: note config options Add a passing mention of the mailmap.file and mailmap.blob configuration options. Before this addition a reader of the "check-mailmap" manpage would have no idea that a custom map could be specified, unless they'd happen to e.g. come across it in the "config" manpage first. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-12 14:04:40 -08:00
Ævar Arnfjörð Bjarmason	4f2ee994f3	mailmap doc: quote config variables `like.this` Quote the mailmap.file and mailmap.blob configuration variables as `mailmap.file` and `mailmap.blob`, and link to git-config(1). This is in line with the preferred way of doing this in the rest of our documentation. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-12 14:04:40 -08:00
Ævar Arnfjörð Bjarmason	42957af027	mailmap doc: create a new "gitmailmap(5)" man page Create a gitmailmap(5) page similar to how .gitmodules and .gitignore have their own pages at gitmodules(5) and gitignore(5). Now instead of "check-mailmap", "blame" and "shortlog" documentation including the description of the format we link to one canonical place. This makes things easier for readers, since in our manpage or web-based[1] output it's not clear that the "MAPPING AUTHORS" sections aren't subtly different, as opposed to just included. 1. https://git-scm.com/docs/git-check-mailmap Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-12 14:04:39 -08:00
Patrick Steinhardt	c7b190dabd	fetch: implement support for atomic reference updates When executing a fetch, then git will currently allocate one reference transaction per reference update and directly commit it. This means that fetches are non-atomic: even if some of the reference updates fail, others may still succeed and modify local references. This is fine in many scenarios, but this strategy has its downsides. - The view of remote references may be inconsistent and may show a bastardized state of the remote repository. - Batching together updates may improve performance in certain scenarios. While the impact probably isn't as pronounced with loose references, the upcoming reftable backend may benefit as it needs to write less files in case the update is batched. - The reference-update hook is currently being executed twice per updated reference. While this doesn't matter when there is no such hook, we have seen severe performance regressions when doing a git-fetch(1) with reference-transaction hook when the remote repository has hundreds of thousands of references. Similar to `git push --atomic`, this commit thus introduces atomic fetches. Instead of allocating one reference transaction per updated reference, it causes us to only allocate a single transaction and commit it as soon as all updates were received. If locking of any reference fails, then we abort the complete transaction and don't update any reference, which gives us an all-or-nothing fetch. Note that this may not completely fix the first of above downsides, as the consistent view also depends on the server-side. If the server doesn't have a consistent view of its own references during the reference negotiation phase, then the client would get the same inconsistent view the server has. This is a separate problem though and, if it actually exists, can be fixed at a later point. This commit also changes the way we write FETCH_HEAD in case `--atomic` is passed. Instead of writing changes as we go, we need to accumulate all changes first and only commit them at the end when we know that all reference updates succeeded. Ideally, we'd just do so via a temporary file so that we don't need to carry all updates in-memory. This isn't trivially doable though considering the `--append` mode, where we do not truncate the file but simply append to it. And given that we support concurrent processes appending to FETCH_HEAD at the same time without any loss of data, seeding the temporary file with current contents of FETCH_HEAD initially and then doing a rename wouldn't work either. So this commit implements the simple strategy of buffering all changes and appending them to the file on commit. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-12 12:06:15 -08:00
Patrick Steinhardt	d4c8db8f1b	fetch: allow passing a transaction to `s_update_ref()` The handling of ref updates is completely handled by `s_update_ref()`, which will manage the complete lifecycle of the reference transaction. This is fine right now given that git-fetch(1) does not support atomic fetches, so each reference gets its own transaction. It is quite inflexible though, as `s_update_ref()` only knows about a single reference update at a time, so it doesn't allow us to alter the strategy. This commit prepares `s_update_ref()` and its only caller `update_local_ref()` to allow passing an external transaction. If none is given, then the existing behaviour is triggered which creates a new transaction and directly commits it. Otherwise, if the caller provides a transaction, then we only queue the update but don't commit it. This optionally allows the caller to manage when a transaction will be committed. Given that `update_local_ref()` is always called with a `NULL` transaction for now, no change in behaviour is expected from this change. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-12 12:06:15 -08:00
Patrick Steinhardt	c45889f104	fetch: refactor `s_update_ref` to use common exit path The cleanup code in `s_update_ref()` is currently duplicated for both succesful and erroneous exit paths. This commit refactors the function to have a shared exit path for both cases to remove the duplication. Suggested-by: Christian Couder <christian.couder@gmail.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-12 12:06:15 -08:00
Patrick Steinhardt	929d044575	fetch: use strbuf to format FETCH_HEAD updates This commit refactors `append_fetch_head()` to use a `struct strbuf` for formatting the update which we're about to append to the FETCH_HEAD file. While the refactoring doesn't have much of a benefit right now, it serves as a preparatory step to implement atomic fetches where we need to buffer all updates to FETCH_HEAD and only flush them out if all reference updates succeeded. No change in behaviour is expected from this commit. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-12 12:06:14 -08:00
Patrick Steinhardt	58a646a368	fetch: extract writing to FETCH_HEAD When performing a fetch with the default `--write-fetch-head` option, we write all updated references to FETCH_HEAD while the updates are performed. Given that updates are not performed atomically, it means that we we write to FETCH_HEAD even if some or all of the reference updates fail. Given that we simply update FETCH_HEAD ad-hoc with each reference, the logic is completely contained in `store_update_refs` and thus quite hard to extend. This can already be seen by the way we skip writing to the FETCH_HEAD: instead of having a conditional which simply skips writing, we instead open "/dev/null" and needlessly write all updates there. We are about to extend git-fetch(1) to accept an `--atomic` flag which will make the fetch an all-or-nothing operation with regards to the reference updates. This will also require us to make the updates to FETCH_HEAD an all-or-nothing operation, but as explained doing so is not easy with the current layout. This commit thus refactors the wa we write to FETCH_HEAD and pulls out the logic to open, append to, commit and close the file. While this may seem rather over-the top at first, pulling out this logic will make it a lot easier to update the code in a subsequent commit. It also allows us to easily skip writing completely in case `--no-write-fetch-head` was passed. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-12 12:06:14 -08:00
Patrick Steinhardt	b342ae61b3	config: extract function to parse config pairs The function `git_config_parse_parameter` is responsible for parsing a `foo.bar=baz`-formatted configuration key, sanitizing the key and then processing it via the given callback function. Given that we're about to add a second user which is going to process keys which already has keys and values separated, this commit extracts a function `config_parse_pair` which only does the sanitization and processing part as a preparatory step. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-12 12:03:18 -08:00
Jeff King	13c44953fb	quote: make sq_dequote_step() a public function We provide a function for dequoting an entire string, as well as one for handling a space-separated list of quoted strings. But there's no way for a caller to parse a string like 'foo'='bar', even though it is easy to generate one using sq_quote_buf() or similar. Let's make the single-step function available to callers outside of quote.c. Note that we do need to adjust its implementation slightly: it insists on seeing whitespace between items, and we'd like to be more flexible than that. Since it only has a single caller, we can move that check (and slurping up any extra whitespace) into that caller. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-12 12:03:18 -08:00
Patrick Steinhardt	ce81b1da23	config: add new way to pass config via `--config-env` While it's already possible to pass runtime configuration via `git -c <key>=<value>`, it may be undesirable to use when the value contains sensitive information. E.g. if one wants to set `http.extraHeader` to contain an authentication token, doing so via `-c` would trivially leak those credentials via e.g. ps(1), which typically also shows command arguments. To enable this usecase without leaking credentials, this commit introduces a new switch `--config-env=<key>=<envvar>`. Instead of directly passing a value for the given key, it instead allows the user to specify the name of an environment variable. The value of that variable will then be used as value of the key. Co-authored-by: Jeff King <peff@peff.net> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-12 12:03:18 -08:00
Jiang Xin	5bb0fd2cab	bundle: arguments can be read from stdin In order to create an incremental bundle, we need to pass many arguments to let git-bundle ignore some already packed commits. It will be more convenient to pass args via stdin. But the current implementation does not allow us to do this. This is because args are parsed twice when creating bundle. The first time for parsing args is in `compute_and_write_prerequisites()` by running `git-rev-list` command to write prerequisites in bundle file, and stdin is consumed in this step if "--stdin" option is provided for `git-bundle`. Later nothing can be read from stdin when running `setup_revisions()` in `create_bundle()`. The solution is to parse args once by removing the entire function `compute_and_write_prerequisites()` and then calling function `setup_revisions()`. In order to write prerequisites for bundle, will call `prepare_revision_walk()` and `traverse_commit_list()`. But after calling `prepare_revision_walk()`, the object array `revs.pending` is left empty, and the following steps could not work properly with the empty object array (`revs.pending`). Therefore, make a copy of `revs` to `revs_copy` for later use right after calling `setup_revisions()`. The copy of `revs_copy` is not a deep copy, it shares the same objects with `revs`. The object array of `revs` has been cleared, but objects themselves are still kept. Flags of objects may change after calling `prepare_revision_walk()`, we can use these changed flags without calling the `git rev-list` command and parsing its output like the former implementation. Also add testcases for git bundle in t6020, which read args from stdin. Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-11 21:50:41 -08:00
Jiang Xin	ce1d6d9f16	bundle: lost objects when removing duplicate pendings `git rev-list` will list one commit for the following command: $ git rev-list 'main^!' <tip-commit-of-main-branch> But providing the same rev-list args to `git bundle`, fail to create a bundle file. $ git bundle create - 'main^!' # v2 git bundle -<OID> <one-line-message> fatal: Refusing to create empty bundle. This is because when removing duplicate objects in function `object_array_remove_duplicates()`, one unique pending object which has the same name is deleted by mistake. The revision arg 'main^!' in the above example is parsed by `handle_revision_arg()`, and at lease two different objects will be appended to `revs.pending`, one points to the parent commit of the "main" branch, and the other points to the tip commit of the "main" branch. These two objects have the same name "main". Only one object is left with the name "main" after calling the function `object_array_remove_duplicates()`. And what's worse, when adding boundary commits into pending list, we use one-line commit message as names, and the arbitory names may surprise git-bundle. Only comparing objects themselves (".item") is also not good enough, because user may want to create a bundle with two identical objects but with different reference names, such as: "HEAD" and "refs/heads/main". Add new function `contains_object()` which compare both the address and the name of the object. Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-11 21:50:41 -08:00
Jiang Xin	9901164d81	test: add helper functions for git-bundle Move git-bundle related functions from t5510 to a library, and this lib will be shared with a new testcase t6020 which finds a known breakage of "git-bundle". Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-11 21:50:41 -08:00
Denton Liu	6436a20284	refs: allow @{n} to work with n-sized reflog This sequence works $ git checkout -b newbranch $ git commit --allow-empty -m one $ git show -s newbranch@{1} and shows the state that was immediately after the newbranch was created. But then if you do $ git reflog expire --expire=now refs/heads/newbranch $ git commit --allow=empty -m two $ git show -s newbranch@{1} you'd be scolded with fatal: log for 'newbranch' only has 1 entries While it is true that it has only 1 entry, we have enough information in that single entry that records the transition between the state in which the tip of the branch was pointing at commit 'one' to the new commit 'two' built on it, so we should be able to answer "what object newbranch was pointing at?". But we refuse to do so. Make @{0} the special case where we use the new side to look up that entry. Otherwise, look up @{n} using the old side of the (n-1)th entry of the reflog. Suggested-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-11 14:13:50 -08:00
Denton Liu	95c2a71820	refs: factor out set_read_ref_cutoffs() This block of code is duplicated twice. In a future commit, it will be duplicated for a third time. Factor out the common functionality into set_read_ref_cutoffs(). In the case of read_ref_at_ent(), we are incrementing `cb->reccnt` at the beginning of the function. Move these to right before the return so that the `cb->reccnt - 1` is changed to `cb->reccnt` and it can be cleanly factored out into set_read_ref_cutoffs(). The duplication of the increment statements will be removed in a future patch. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-10 12:24:00 -08:00
SZEDER Gábor	e3f5da7e60	t7800-difftool: don't accidentally match tmp dirs In a bunch of test cases in 't7800-difftool.sh' we 'grep' for specific filenames in 'git difftool's output, and those test cases are prone to occasional failures because those filenames might be part of the name of difftool's temporary directory as well, e.g.: +git difftool --dir-diff --no-symlinks --extcmd ls v1 +grep sub output +test_line_count = 2 sub-output test_line_count: line count for sub-output != 2 /tmp/git-difftool.Ssubfq/left/: sub /tmp/git-difftool.Ssubfq/right/: sub error: last command exited with $?=1 not ok 50 - difftool --dir-diff v1 from subdirectory --no-symlinks Fix this by tightening the 'grep' patterns looking for those interesting filenames to match only lines where a filename stands on its own. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-09 13:40:32 -08:00
Elijah Newren	eb3e3e1ddf	merge-ort: collect which directories are removed in dirs_removed Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-07 15:30:03 -08:00
Elijah Newren	f5d9fbc2e9	merge-ort: initialize and free new directory rename data structures Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-07 15:30:03 -08:00
Elijah Newren	c09376d55f	merge-ort: add new data structures for directory rename detection Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-07 15:30:02 -08:00
Junio C Hamano	8f894b2263	Merge branch 'en/merge-ort-3' into en/ort-directory-rename * en/merge-ort-3: merge-ort: add implementation of type-changed rename handling merge-ort: add implementation of normal rename handling merge-ort: add implementation of rename collisions merge-ort: add implementation of rename/delete conflicts merge-ort: add implementation of both sides renaming differently merge-ort: add implementation of both sides renaming identically merge-ort: add basic outline for process_renames() merge-ort: implement compare_pairs() and collect_renames() merge-ort: implement detect_regular_renames() merge-ort: add initial outline for basic rename detection merge-ort: add basic data structures for handling renames	2021-01-07 15:29:49 -08:00
Junio C Hamano	72c4083ddf	The first batch in 2.31 cycle Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-06 23:33:44 -08:00
Junio C Hamano	d3aff11c3e	Merge branch 'es/perf-export-fix' Tweak unneeded recursion from a test framework helper function. * es/perf-export-fix: t/perf: avoid unnecessary test_export() recursion	2021-01-06 23:33:44 -08:00
Junio C Hamano	cf4b0714f7	Merge branch 'fc/t6030-bisect-reset-removes-auxiliary-files' A 3-year old test that was not testing anything useful has been corrected. * fc/t6030-bisect-reset-removes-auxiliary-files: test: bisect-porcelain: fix location of files	2021-01-06 23:33:44 -08:00
Junio C Hamano	8664fcb83b	Merge branch 'es/worktree-repair-both-moved' "git worktree repair" learned to deal with the case where both the repository and the worktree moved. * es/worktree-repair-both-moved: worktree: teach `repair` to fix multi-directional breakage	2021-01-06 23:33:44 -08:00
Junio C Hamano	45a177069f	Merge branch 'en/merge-ort-recursive' The ORT merge strategy learned to synthesize virtual ancestor tree by recursively merging multiple merge bases together, just like the recursive backend has done for years. * en/merge-ort-recursive: merge-ort: implement merge_incore_recursive() merge-ort: make clear_internal_opts() aware of partial clearing merge-ort: copy a few small helper functions from merge-recursive.c commit: move reverse_commit_list() from merge-recursive	2021-01-06 23:33:44 -08:00
Junio C Hamano	d3fa84d528	Merge branch 'fc/pull-merge-rebase' When a user does not tell "git pull" to use rebase or merge, the command gives a loud message telling a user to choose between rebase or merge but creates a merge anyway, forcing users who would want to rebase to redo the operation. Fix an early part of this problem by tightening the condition to give the message---there is no reason to stop or force the user to choose between rebase or merge if the history fast-forwards. * fc/pull-merge-rebase: pull: display default warning only when non-ff pull: correct condition to trigger non-ff advice pull: get rid of unnecessary global variable pull: give the advice for choosing rebase/merge much later pull: refactor fast-forward check	2021-01-06 23:33:44 -08:00
Junio C Hamano	85cf82ff01	Merge branch 'en/merge-ort-2' More "ORT" merge strategy. * en/merge-ort-2: merge-ort: add modify/delete handling and delayed output processing merge-ort: add die-not-implemented stub handle_content_merge() function merge-ort: add function grouping comments merge-ort: add a paths_to_free field to merge_options_internal merge-ort: add a path_conflict field to merge_options_internal merge-ort: add a clear_internal_opts helper merge-ort: add a few includes	2021-01-06 23:33:44 -08:00
Junio C Hamano	f9d29daba6	Merge branch 'en/merge-ort-impl' The merge backend "done right" starts to emerge. * en/merge-ort-impl: merge-ort: free data structures in merge_finalize() merge-ort: add implementation of record_conflicted_index_entries() tree: enable cmp_cache_name_compare() to be used elsewhere merge-ort: add implementation of checkout() merge-ort: basic outline for merge_switch_to_result() merge-ort: step 3 of tree writing -- handling subdirectories as we go merge-ort: step 2 of tree writing -- function to create tree object merge-ort: step 1 of tree writing -- record basenames, modes, and oids merge-ort: have process_entries operate in a defined order merge-ort: add a preliminary simple process_entries() implementation merge-ort: avoid recursing into identical trees merge-ort: record stage and auxiliary info for every path merge-ort: compute a few more useful fields for collect_merge_info merge-ort: avoid repeating fill_tree_descriptor() on the same tree merge-ort: implement a very basic collect_merge_info() merge-ort: add an err() function similar to one from merge-recursive merge-ort: use histogram diff merge-ort: port merge_start() from merge-recursive merge-ort: add some high-level algorithm structure merge-ort: setup basic internal data structures	2021-01-06 23:33:43 -08:00
Junio C Hamano	c256631065	Merge branch 'tb/pack-bitmap' Various improvements to the codepath that writes out pack bitmaps. * tb/pack-bitmap: (24 commits) pack-bitmap-write: better reuse bitmaps pack-bitmap-write: relax unique revwalk condition pack-bitmap-write: use existing bitmaps pack-bitmap: factor out 'add_commit_to_bitmap()' pack-bitmap: factor out 'bitmap_for_commit()' pack-bitmap-write: ignore BITMAP_FLAG_REUSE pack-bitmap-write: build fewer intermediate bitmaps pack-bitmap.c: check reads more aggressively when loading pack-bitmap-write: rename children to reverse_edges t5310: add branch-based checks commit: implement commit_list_contains() bitmap: implement bitmap_is_subset() pack-bitmap-write: fill bitmap with commit history pack-bitmap-write: pass ownership of intermediate bitmaps pack-bitmap-write: reimplement bitmap writing ewah: add bitmap_dup() function ewah: implement bitmap_or() ewah: make bitmap growth less aggressive ewah: factor out bitmap growth rev-list: die when --test-bitmap detects a mismatch ...	2021-01-06 23:33:43 -08:00
Junio C Hamano	b62bbd3580	Merge branch 'ab/trailers-extra-format' The "--format=%(trailers)" mechanism gets enhanced to make it easier to design output for machine consumption. * ab/trailers-extra-format: pretty format %(trailers): add a "key_value_separator" pretty format %(trailers): add a "keyonly" pretty-format %(trailers): fix broken standalone "valueonly" pretty format %(trailers) doc: avoid repetition pretty format %(trailers) test: split a long line	2021-01-06 23:33:43 -08:00
Junio C Hamano	c977ff4407	Merge branch 'pk/subsub-fetch-fix-take-2' "git fetch --recurse-submodules" fix (second attempt). * pk/subsub-fetch-fix-take-2: submodules: fix of regression on fetching of non-init subsub-repo	2021-01-06 23:33:43 -08:00
Patrick Steinhardt	b0812b6ac0	git: add `--super-prefix` to usage string When the `--super-prefix` option was implmented in `74866d7579` (git: make super-prefix option, 2016-10-07), its existence was only documented in the manpage but not in the command's own usage string. Given that the commit message didn't mention that this was done intentionally and given that it's documented in the manpage, this seems like an oversight. Add it to the usage string to fix the inconsistency. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-06 22:55:06 -08:00
Ævar Arnfjörð Bjarmason	06ce79152b	mktag: add a --[no-]strict option Now that mktag has been migrated to use the fsck machinery to check its input, it makes sense to teach it to run in the equivalent of "git fsck"'s default mode. For cases where mktag is used to (re)create a tag object using data from an existing and malformed tag object, the validation may optionally have to be loosened. Teach the command to take the "--[no-]strict" option to do so. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-06 14:22:24 -08:00
Ævar Arnfjörð Bjarmason	2aa9425fbe	mktag: mark strings for translation Mark the errors mktag might emit for translation. This is a plumbing command, but the errors it emits are intended to be human-readable. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-05 14:58:29 -08:00
Ævar Arnfjörð Bjarmason	3f390a366c	mktag: convert to parse-options Convert the "mktag" command to use parse-options.h instead of its own ad-hoc argc handling. This doesn't matter much in practice since it doesn't support any options, but removes another special-case in our codebase, and makes it easier to add options to it in the future. It does marginally improve the situation for programs that want to execute git commands in a consistent manner and e.g. always use --end-of-options. E.g. "gitaly" does that, and has a blacklist of built-ins that don't support --end-of-options. This is one less special case for it and other similar programs to support. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-05 14:58:29 -08:00
Ævar Arnfjörð Bjarmason	9a1a3a4d4c	mktag: allow omitting the header/body \n separator Change mktag's acceptance rules to accept an empty body without an empty line after the header again. This fixes an ancient unintended dregression in "mktag". When "mktag" was introduced in `ec4465adb3` (Add "tag" objects that can be used to sign other objects., 2005-04-25) the input checks were much looser. When it was documented it `6cfec03680` (mktag: minimally update the description., 2007-06-10) it was clearly intended for this \n to be optional: The message, when [it] exists, is separated by a blank line from the header. But then in `e0aaf781f6` (mktag.c: improve verification of tagger field and tests, 2008-03-27) this was made an error, seemingly by accident. It was just a result of the general header checks, and all the tests after that patch have a trailing empty line (but did not before). Let's allow this again, and tweak the test semantics changed in `e0aaf781f6` to remove the redundant empty line. New tests added in previous commits of mine already added an explicit test for allowing the empty line between header and body. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-05 14:58:29 -08:00
Ævar Arnfjörð Bjarmason	acfc01332b	mktag: allow turning off fsck.extraHeaderEntry In earlier commits mktag learned to use the fsck machinery, at which point we needed to add fsck.extraHeaderEntry so it could be as strict about extra headers as it's been ever since it was implemented. But it's not nice to need to switch away from "mktag" to "hash-object" + manual "fsck" just because you'd like to have an extra header. So let's support turning it off by getting "fsck.*" variables from the config. Pedantically speaking it's still not possible to make "mktag" behave just like "hash-object -t tag" does, since we're unconditionally going to check the referenced object in verify_object_in_tag(), which is our own check, and not one that exists in fsck.c. But the spirit of "this works like fsck" is preserved, in that if you created such a tag with "hash-object" and did a full "fsck" on the repository it would also error out about that invalid object, it just wouldn't emit the same message as fsck does. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-05 14:58:29 -08:00
Ævar Arnfjörð Bjarmason	1f3299fda9	fsck: make fsck_config() re-usable Move the fsck_config() function from builtin/fsck.c to fsck.[ch]. This allows for re-using it in other tools that expose fsck logic and want to support its configuration variables. A logical continuation of this change would be to use a common function for all of {fetch,receive}.fsck.* and fsck.. See `5d477a334a` (fsck (receive-pack): allow demoting errors to warnings, 2015-06-22) and my own `1362df0d41` (fetch: implement fetch.fsck., 2018-07-27) for the relevant code. However, those routines want to not parse the fsck.skipList into OIDs, but rather pass them along with the --strict option to another process. It would be possible to refactor that whole thing so we support e.g. a "fetch." prefix, then just keep track of the skiplist as a filename instead of parsing it, and learn to spew that all out from our internal structures into something we can append to the --strict option. But instead I'm planning to re-use this in "mktag", which'll just re-use these "fsck.*" variables as-is. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-05 14:58:29 -08:00
Ævar Arnfjörð Bjarmason	acf9de4c94	mktag: use fsck instead of custom verify_tag() Change the validation logic in "mktag" to use fsck's fsck_tag() instead of its own custom parser. Curiously the logic for both dates back to the same commit[1]. Let's unify them so we're not maintaining two sets functions to verify that a tag is OK. The behavior of fsck_tag() and the old "mktag" code being removed here is different in few aspects. I think it makes sense to remove some of those checks, namely: A. fsck only cares that the timezone matches [-+][0-9]{4}. The mktag code disallowed values larger than 1400. Yes there's currently no timezone with a greater offset[2], but since we allow any number of non-offical timezones (e.g. +1234) passing this through seems fine. Git also won't break in the future if e.g. French Polynesia decides it needs to outdo the Line Islands when it comes to timezone extravagance. B. fsck allows missing author names such as "tagger <email>", mktag wouldn't, but would allow e.g. "tagger [2 spaces] <email>" (but not "tagger [1 space] <email>"). Now we allow all of these. C. Like B, but "mktag" disallowed spaces in the <email> part, fsck allows it. In some ways fsck_tag() is stricter than "mktag" was, namely: D. fsck disallows zero-padded dates, but mktag didn't care. So e.g. the timestamp "0000000000 +0000" produces an error now. A test in "t1006-cat-file.sh" relied on this, it's been changed to use "hash-object" (without fsck) instead. There was one check I deemed worth keeping by porting it over to fsck_tag(): E. "mktag" did not allow any custom headers, and by extension (as an empty commit is allowed) also forbade an extra stray trailing newline after the headers it knew about. Add a new check in the "ignore" category to fsck and use it. This somewhat abuses the facility added in `efaba7cc77` (fsck: optionally ignore specific fsck issues completely, 2015-06-22). This is somewhat of hack, but probably the least invasive change we can make here. The fsck command will shuffle these categories around, e.g. under --strict the "info" becomes a "warn" and "warn" becomes "error". Existing users of fsck's (and others, e.g. index-pack) --strict option rely on this. So we need to put something into a category that'll be ignored by all existing users of the API. Pretending that fsck.extraHeaderEntry=error ("ignore" by default) was set serves to do this for us. 1. `ec4465adb3` (Add "tag" objects that can be used to sign other objects., 2005-04-25) 2. https://en.wikipedia.org/wiki/List_of_UTC_time_offsets Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-05 14:58:29 -08:00
Ævar Arnfjörð Bjarmason	40ef015a27	mktag: use puts(str) instead of printf("%s\n", str) This introduces no functional change, but refactors the print-out of the hash at the end to do the same thing with less code. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-05 14:58:29 -08:00
Ævar Arnfjörð Bjarmason	dfe3948728	mktag: remove redundant braces in one-line body "if" This minor stylistic churn is usually something we'd avoid, but if we don't do this then the file after changes in subsequent commits will only have this minor style inconsistency, so let's change this while we're at it. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-05 14:58:29 -08:00
Ævar Arnfjörð Bjarmason	0c439117bb	mktag: use default strbuf_read() hint Change the hardcoded hint of 2^12 to 0. The default strbuf hint is perfectly fine here, and the only reason we were hardcoding it is because it survived migration from a pre-strbuf fixed-sized buffer. See `fd17f5b5f7` (Replace all read_fd use with strbuf_read, and get rid of it., 2007-09-10) for that migration. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-05 14:58:29 -08:00
Ævar Arnfjörð Bjarmason	692654dca0	mktag tests: test verify_object() with replaced objects Add tests to demonstrate what "mktag" does in the face of replaced objects. There was an existing test for replaced objects fed to "mktag" added in `cc400f5011` (mktag: call "check_sha1_signature" with the replacement sha1, 2009-01-23), but that one only tests a commit->commit mapping. Not a mapping to a different type as like we're also testing for here. We could remove the "mktag" test in t6050-replace.sh now if the created tag wasn't being used by a subsequent "fsck" test. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-05 14:58:29 -08:00
Ævar Arnfjörð Bjarmason	30f882c16d	mktag tests: improve verify_object() test coverage The verify_object() function in "mktag.c" is tasked with ensuring that our tag refers to a valid object. The existing test for this might fail because it was also testing that "type taggg" didn't refer to a valid object type (it should be "type tag"), or because we referred to a valid object but got the type wrong. Let's split these tests up, so we're testing all combinations of a non-existing object and in invalid/wrong "type" lines. We need to provide GIT_TEST_GETTEXT_POISON=false here because the "invalid object type" error is emitted by parse_loose_header_extended(), which has that message already marked for translation. Another option would be to use test_i18ngrep, but I prefer always running the test, not skipping it under gettext poison testing. I'm not testing this in combination with "git replace". That'll be done in a subsequent commit. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-05 14:58:29 -08:00
Ævar Arnfjörð Bjarmason	ca9a1ed969	mktag tests: test "hash-object" compatibility Change all the successful "mktag" tests to test that "hash-object" produces the same hash for the input, and that fsck passes for both. This tests e.g. that "mktag" doesn't trim its input or otherwise munge it in a way that "hash-object" doesn't. Since we're doing an "fsck --strict" here at the end let's incorporate the creation of the "mytag" name into this test, removing the special-case at the end of the file. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-05 14:58:29 -08:00
Ævar Arnfjörð Bjarmason	47c95e77d1	mktag tests: stress test whitespace handling Add tests for a couple of whitespace edge cases around the header/body boundary. I consider the requirement for a blank line before the empty body a bug, it's a long-standing regression which goes against the command's documented behavior. This bug will be addressed in a follow-up change. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-05 14:58:29 -08:00
Ævar Arnfjörð Bjarmason	3b9e4dd3a3	mktag tests: run "fsck" after creating "mytag" Change the last test in the file to run an "fsck --strict" after creating the tag at the end. We're just doing this for good measure to check that fsck behaves as expected now that there's finally a reference for our valid tag. Other tests going to be checking this elsewhere, but it's nice to cover all the edge cases in this test to make it as self-contained as possible. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-05 14:58:29 -08:00
Ævar Arnfjörð Bjarmason	5c2303e0c7	mktag tests: don't create "mytag" twice Change a test added in `e0aaf781f6` (mktag.c: improve verification of tagger field and tests, 2008-03-27) to not create "mytag", which should only be created and verified at the end in an earlier test added in `446c6faec6` (New tests and en-passant modifications to mktag., 2006-07-29). While we're at it let's prevent a similar logic error from creeping into the test by asserting that "mytag" doesn't exist before we create it. Let's do this by moving the test to use "update-ref", instead of our own homebrew ad-hoc refstore update. We're not really testing for anything yet by creating the tag at the end here. A subsequent commit will change that. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-05 14:58:29 -08:00
Ævar Arnfjörð Bjarmason	317c176279	mktag tests: don't redirect stderr to a file needlessly Remove the redirection of stderr to "message" in the valid tag test. This pattern seems to have been copy/pasted from the failure case in `446c6faec6` (New tests and en-passant modifications to mktag., 2006-07-29). While I'm at it do the same for the "replace" tests. The tag creation I'm changing here seems to have been copy/pasted from the "mktag" tests to those tests in `cc400f5011` (mktag: call "check_sha1_signature" with the replacement sha1, 2009-01-23). Nobody examines the contents of the resulting "message" file, so the net result is that error messages cannot be seen in "sh t3800-mktag.sh -v" output. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-05 14:58:28 -08:00
Ævar Arnfjörð Bjarmason	0d35ccb5e0	mktag tests: remove needless SHA-1 hardcoding Change the tests amended in `acb49d1cc8` (t3800: make hash-size independent, 2019-08-18) even more to make them independent of either SHA-1 or SHA-256. Some of these tests were failing for the wrong reasons. The first one being modified here would fail because the line starts with "xxxxxx" instead of "object", the rest of the line doesn't matter. Let's just put a valid hash on the rest of the line anyway to narrow the test down for just the s/object/xxxxxx/ case. The second one being modified here would fail under GIT_TEST_DEFAULT_HASH=sha256 because <some sha-1 length garbage> is an invalid SHA-256, but we should really be testing <some sha-256 length garbage> when under SHA-256. This doesn't really matter since we should be able to trust other parts of the code to validate things in the 0-9a-f range, but let's keep it for good measure. There's a later test which tests an invalid SHA which looks like a valid one, to stress the "We refuse to tag something we can't verify[...]" logic in mktag.c. But here we're testing for a SHA-length string which contains characters outside of the /[0-9a-f]/i set. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-05 14:58:28 -08:00
Ævar Arnfjörð Bjarmason	b5ca549c93	mktag tests: use "test_commit" helper Replace ad-hoc setup of a single commit in the "mktag" tests with our standard helper pattern. The old setup dated back to `446c6faec6` (New tests and en-passant modifications to mktag., 2006-07-29) before the helper existed. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-05 14:58:28 -08:00
Ævar Arnfjörð Bjarmason	aba5377f69	mktag tests: don't needlessly use a subshell The use of a subshell dates back to `e9b20943b7` (t/t3800: do not use a temporary file to hold expected result., 2008-01-04). It's not needed anymore, if it ever was. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-05 14:58:28 -08:00
Ævar Arnfjörð Bjarmason	18430ed363	mktag doc: update to explain why to use this Change the mktag documentation to compare itself to the similar "hash-object -t tag" command. Before this someone reading the documentation wouldn't have much of an idea what the difference was. Let's allude to our own validation logic, and cross-link the "mktag" and "hash-object" documentation to aid discover-ability. A follow-up change to migrate "mktag" to use "fsck" validation will make the part about validation logic clearer. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-05 14:58:28 -08:00
Derrick Stolee	3797a0a7b7	maintenance: use Windows scheduled tasks Git's background maintenance uses cron by default, but this is not available on Windows. Instead, integrate with Task Scheduler. Tasks can be scheduled using the 'schtasks' command. There are several command-line options that can allow for some advanced scheduling, but unfortunately these seem to all require authenticating using a password. Instead, use the "/xml" option to pass an XML file that contains the configuration for the necessary schedule. These XML files are based on some that I exported after constructing a schedule in the Task Scheduler GUI. These options only run background maintenance when the user is logged in, and more fields are populated with the current username and SID at run-time by 'schtasks'. Since the GIT_TEST_MAINT_SCHEDULER environment variable allows us to specify 'schtasks' as the scheduler, we can test the Windows-specific logic on other platforms. Thus, add a check that the XML file written by Git is valid when xmllint exists on the system. Since we use a temporary file for the XML files sent to 'schtasks', we prefix the random characters with the frequency so it is easier to examine the proper file during tests. Instead of an exact match on the 'args' file, we 'grep' for the arguments other than the filename. There is a deficiency in the current design. Windows has two kinds of applications: GUI applications that start by "winmain()" and console applications that start by "main()". Console applications are attached to a new Console window if they are not already associated with a GUI application. This means that every hour the scheudled task launches a command window for the scheduled tasks. Not only is this visually obtrusive, but it also takes focus from whatever else the user is doing! A simple fix would be to insert a GUI application that acts as a shim between the scheduled task and Git. This is currently possible in Git for Windows by setting the <Command> tag equal to C:\Program Files\Git\git-bash.exe with options "--hide --no-needs-console --command=cmd\git.exe" followed by the arguments currently used. Since git-bash.exe is not included in Windows builds of core Git, I chose to leave out this feature. My plan is to submit a small patch to Git for Windows that converts the use of git.exe with this use of git-bash.exe in the short term. In the long term, we can consider creating this GUI shim application within core Git, perhaps in contrib/. Co-authored-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-05 14:38:02 -08:00
Derrick Stolee	2afe7e3567	maintenance: use launchctl on macOS The existing mechanism for scheduling background maintenance is done through cron. The 'crontab -e' command allows updating the schedule while cron itself runs those commands. While this is technically supported by macOS, it has some significant deficiencies: 1. Every run of 'crontab -e' must request elevated privileges through the user interface. When running 'git maintenance start' from the Terminal app, it presents a dialog box saying "Terminal.app would like to administer your computer. Administration can include modifying passwords, networking, and system settings." This is more alarming than what we are hoping to achieve. If this alert had some information about how "git" is trying to run "crontab" then we would have some reason to believe that this dialog might be fine. However, it also doesn't help that some scenarios just leave Git waiting for a response without presenting anything to the user. I experienced this when executing the command from a Bash terminal view inside Visual Studio Code. 2. While cron initializes a user environment enough for "git config --global --show-origin" to show the correct config file information, it does not set up the environment enough for Git Credential Manager Core to load credentials during a 'prefetch' task. My prefetches against private repositories required re-authenticating through UI pop-ups in a way that should not be required. The solution is to switch from cron to the Apple-recommended [1] 'launchd' tool. [1] https://developer.apple.com/library/archive/documentation/MacOSX/Conceptual/BPSystemStartup/Chapters/ScheduledJobs.html The basics of this tool is that we need to create XML-formatted "plist" files inside "~/Library/LaunchAgents/" and then use the 'launchctl' tool to make launchd aware of them. The plist files include all of the scheduling information, along with the command-line arguments split across an array of <string> tags. For example, here is my plist file for the weekly scheduled tasks: <?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd"> <plist version="1.0"><dict> <key>Label</key><string>org.git-scm.git.weekly</string> <key>ProgramArguments</key> <array> <string>/usr/local/libexec/git-core/git</string> <string>--exec-path=/usr/local/libexec/git-core</string> <string>for-each-repo</string> <string>--config=maintenance.repo</string> <string>maintenance</string> <string>run</string> <string>--schedule=weekly</string> </array> <key>StartCalendarInterval</key> <array> <dict> <key>Day</key><integer>0</integer> <key>Hour</key><integer>0</integer> <key>Minute</key><integer>0</integer> </dict> </array> </dict> </plist> The schedules for the daily and hourly tasks are more complicated since we need to use an array for the StartCalendarInterval with an entry for each of the six days other than the 0th day (to avoid colliding with the weekly task), and each of the 23 hours other than the 0th hour (to avoid colliding with the daily task). The "Label" value is currently filled with "org.git-scm.git.X" where X is the frequency. We need a different plist file for each frequency. The launchctl command needs to be aligned with a user id in order to initialize the command environment. This must be done using the 'launchctl bootstrap' subcommand. This subcommand is new as of macOS 10.11, which was released in September 2015. Before that release the 'launchctl load' subcommand was recommended. The best source of information on this transition I have seen is available at [2]. The current design does not preclude a future version that detects the available fatures of 'launchctl' to use the older commands. However, it is best to rely on the newest version since Apple might completely remove the deprecated version on short notice. [2] https://babodee.wordpress.com/2016/04/09/launchctl-2-0-syntax/ To remove a schedule, we must run 'launchctl bootout' with a valid plist file. We also need to 'bootout' a task before the 'bootstrap' subcommand will succeed, if such a task already exists. The need for a user id requires us to run 'id -u' which works on POSIX systems but not Windows. Further, the need for fully-qualitifed path names including $HOME behaves differently in the Git internals and the external test suite. The $HOME variable starts with "C:\..." instead of the "/c/..." that is provided by Git in these subcommands. The test therefore has a prerequisite that we are not on Windows. The cross- platform logic still allows us to test the macOS logic on a Linux machine. We can verify the commands that were run by 'git maintenance start' and 'git maintenance stop' by injecting a script that writes the command-line arguments into GIT_TEST_MAINT_SCHEDULER. An earlier version of this patch accidentally had an opening "<dict>" tag when it should have had a closing "</dict>" tag. This was caught during manual testing with actual 'launchctl' commands, but we do not want to update developers' tasks when running tests. It appears that macOS includes the "xmllint" tool which can verify the XML format. This is useful for any system that might contain the tool, so use it whenever it is available. We strive to make these tests work on all platforms, but Windows caused some headaches. In particular, the value of getuid() called by the C code is not guaranteed to be the same as `$(id -u)` invoked by a test. This is because `git.exe` is a native Windows program, whereas the utility programs run by the test script mostly utilize the MSYS2 runtime, which emulates a POSIX-like environment. Since the purpose of the test is to check that the input to the hook is well-formed, the actual user ID is immaterial, thus we can work around the problem by making the the test UID-agnostic. Another subtle issue is the $HOME environment variable being a Windows-style path instead of a Unix-style path. We can be more flexible here instead of expecting exact path matches. Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Co-authored-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-05 14:38:02 -08:00
Felipe Contreras	5a067ba9d0	completion: add proper public __git_complete When __git_complete was introduced, it was meant to be temporarily, while a proper guideline for public shell functions was established (tentatively _GIT_complete), but since that never happened, people in the wild started to use __git_complete, even though it was marked as not public. Eight years is more than enough wait, let's mark this function as public, and make it a bit more user-friendly. So that instead of doing: __git_complete gk __gitk_main The user can do: __git_complete gk gitk And instead of: __git_complete gf _git_fetch Do: __git_complete gf git_fetch Backwards compatibility is maintained. Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-04 15:25:56 -08:00
Felipe Contreras	0e02bdc17a	test: completion: add tests for __git_complete Even though the function was marked as not public, it's already used in the wild. We should at least test basic functionality. Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-04 15:25:56 -08:00
Felipe Contreras	810df0ea8e	completion: bash: improve function detection 1. We should quote the argument 2. We don't need two redirections 3. A safeguard for arguments (-a) would be good Suggested-by: René Scharfe <l.s.r@web.de> Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-04 15:25:56 -08:00
Felipe Contreras	7f94b78dda	completion: bash: add __git_have_func helper This makes the code more readable, and also will help when new code wants to do similar checks. Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-04 15:25:56 -08:00
Derrick Stolee	fa7ca5d4fe	cache-tree: use trace2 in cache_tree_update() This matches a trace_performance_enter()/trace_performance_leave() pair added by `0d1ed59` (unpack-trees: add performance tracing, 2018-08-18). Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-04 15:23:08 -08:00
Derrick Stolee	c338898a47	unpack-trees: add trace2 regions The unpack_trees() method is quite complicated and its performance can change dramatically depending on how it is used. We already have some performance tracing regions, but they have not been updated to the trace2 API. Do so now. We already have trace2 regions in unpack_trees.c:clear_ce_flags(), which uses a linear scan through the index without recursing into trees. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-04 15:23:08 -08:00
Derrick Stolee	da8be8ced6	tree-walk: report recursion counts The traverse_trees() method recursively walks through trees, but also prunes the tree-walk based on a callback. Some callers, such as unpack_trees(), are quite complicated and can have wildly different performance between two different commands. Create constants that count these values and then report the results at the end of a process. These counts are cumulative across multiple "root" instances of traverse_trees(), but they provide reproducible values for demonstrating improvements to the pruning algorithm when possible. This change is modeled after a similar statistics reporting in `42e50e78` (revision.c: add trace2 stats around Bloom filter usage, 2020-04-06). Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-04 15:23:08 -08:00
Derrick Stolee	90b666da60	revision: trace topo-walk statistics We trace statistics about the effectiveness of changed-path Bloom filters since `42e50e78` (revision.c: add trace2 stats around Bloom filter usage, 2020-04-06). Add similar tracing for the topo-walk algorithm that uses generation numbers to limit the walk size. This information can help investigate and describe benefits to heuristics and other changes. The information that is printed is in JSON format and can be formatted nicely to present as follows: { "count_explort_walked":2603, "count_indegree_walked":2603, "count_topo_walked":473 } Each of these values count the number of commits are visited by each of the three "stages" of the topo-walk as detailed in `b4542418` (revision.c: generation-based topo-order algorithm, 2018-11-01). Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-04 15:18:22 -08:00
Martin Ågren	bc62692757	hash-lookup: rename from sha1-lookup Change all remnants of "sha1" in hash-lookup.c and .h and rename them to reflect that we're not just able to handle SHA-1 these days. Signed-off-by: Martin Ågren <martin.agren@gmail.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-04 13:01:55 -08:00
Martin Ågren	7a7d992d0d	sha1-lookup: rename `sha1_pos()` as `hash_pos()` Rename this function to reflect that we're not just able to handle SHA-1 these days. There are a few instances of "sha1" left in sha1-lookup.[ch] after this, but those will be addressed in the next commit. Signed-off-by: Martin Ågren <martin.agren@gmail.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-04 13:01:55 -08:00
Martin Ågren	e5afd4449d	object-file.c: rename from sha1-file.c Drop the last remnant of "sha1" in this file and rename it to reflect that we're not just able to handle SHA-1 these days. Signed-off-by: Martin Ågren <martin.agren@gmail.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-04 13:01:55 -08:00
Martin Ågren	1e6771e504	object-name.c: rename from sha1-name.c Generalize the last remnants of "sha" and "sha1" in this file and rename it to reflect that we're not just able to handle SHA-1 these days. We need to update one test to check for an updated error string. Signed-off-by: Martin Ågren <martin.agren@gmail.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-04 13:01:55 -08:00
Elijah Newren	350410f6b1	diffcore-rename: remove unnecessary duplicate entry checks Commit `25d5ea410f` ("[PATCH] Redo rename/copy detection logic.", 2005-05-24) added a duplicate entry check on rename_src in order to avoid segfaults; the code at the time was prone to double free()s and an easy way to avoid it was just to turn off rename detection for any duplicate entries. Note that the form of the check was modified two commits ago in this series. Similarly, commit `4d6be03b95` ("diffcore-rename: avoid processing duplicate destinations", 2015-02-26) added a duplicate entry check on rename_dst for the exact same reason -- the code was prone to double free()s, and an easy way to avoid it was just to turn off rename detection entirely. Note that the form of the check was modified in the commit just before this one. In the original code in both places, the code was dealing with individual diff_filespecs and trying to match things up, instead of just keeping the original diff_filepairs around as we do now. The intervening change in structure has fixed the accounting problems and the associated double free()s that used to occur, and thus we already have a better fix. As such, we can remove the band-aid checks for duplicate entries. Due to the last two patches, the diffcore_rename() setup is no longer a sizeable chunk of overall runtime. Thus, in a large rebase of many commits with lots of renames and several optimizations to inexact rename detection, this patch only speeds up the overall code by about half a percent or so and is pretty close to the run-to-run variability making it hard to get an exact measurement. However, with some trace2 regions around the setup code in diffcore_rename() so that I can focus on just it, I measure that this patch consistently saves almost a third of the remaining time spent in diffcore_rename() setup. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-04 12:59:34 -08:00
Elijah Newren	4ef88fc3a8	merge-ort: add handling for different types of files at same path Add some handling that explicitly considers collisions of the following types: * file/submodule * file/symlink * submodule/symlink Leaving them as conflicts at the same path are hard for users to resolve, so move one or both of them aside so that they each get their own path. Note that in the case of recursive handling (i.e. call_depth > 0), we can just use the merge base of the two merge bases as the merge result much like we do with modify/delete conflicts, binary files, conflicting submodule values, and so on. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-04 10:40:45 -08:00
Elijah Newren	4204cd591b	merge-ort: copy find_first_merges() implementation from merge-recursive.c Code is identical for the function body in the two files, the call signature is just slightly different in merge-ort than merge-recursive as noted a couple commits ago. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-04 10:40:45 -08:00
Elijah Newren	70f19c7fce	merge-ort: implement format_commit() This implementation is based on a mixture of print_commit() and output_commit_title() from merge-recursive.c so that it can be used to take over both functions. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-04 10:40:45 -08:00
Elijah Newren	c73cda76b1	merge-ort: copy and adapt merge_submodule() from merge-recursive.c Take merge_submodule() from merge-recursive.c and make slight adjustments, predominantly around deferring output using path_msg() instead of using merge-recursive's output() and show() functions. There's also a fix for recursive cases (when call_depth > 0) and a slight change to argument order for find_first_merges(). find_first_merges() and format_commit() are left unimplemented for now, but will be added by subsequent commits. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-04 10:40:45 -08:00
Elijah Newren	f591c47246	merge-ort: copy and adapt merge_3way() from merge-recursive.c Take merge_3way() from merge-recursive.c and make slight adjustments based on different data structures (direct usage of object_id rather diff_filespec, separate pathnames which based on our careful interning of pathnames in opt->priv->paths can be compared with '!=' rather than 'strcmp'). Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-04 10:40:45 -08:00
Elijah Newren	62fdec17a1	merge-ort: flesh out implementation of handle_content_merge() This implementation is based heavily on merge_mode_and_contents() from merge-recursive.c, though it has some fixes for recursive merges (i.e. when call_depth > 0), and has a number of changes throughout based on slight differences in data structures and in how the functions are called. It is, however, based on two new helper functions -- merge_3way() and merge_submodule -- for which we only provide die-not-implemented stubs at this point. Future commits will add implementations of these functions. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-04 10:40:45 -08:00
Elijah Newren	991bbdcab9	merge-ort: handle book-keeping around two- and three-way content merge In addition to the content merge (which will go in a subsequent commit), we need to worry about conflict messages, placing results in higher order stages in case of a df_conflict, and making sure the results are placed in ci->merged.result so that they will show up in the working tree. Take care of all that external book-keeping, moving the simplistic just-take-HEAD code into the barebones handle_content_merge() function for now. Subsequent commits will flesh out handle_content_merge(). Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-04 10:40:45 -08:00
Elijah Newren	5a1a1e8ea9	merge-ort: implement unique_path() helper Implement unique_path(), based on the one from merge-recursive.c. It is simplified, however, due to: (1) using strmaps, and (2) the fact that merge-ort lets the checkout codepath handle possible collisions with the working tree means that other code locations don't have to. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-04 10:40:45 -08:00
Elijah Newren	23366d2aa9	merge-ort: handle directory/file conflicts that remain When a directory/file conflict remains, we can leave the directory where it is, but need to move the information about the file to a different pathname. After moving the file to a different pathname, we allow subsequent process_entry() logic to handle any additional details that might be relevant. This depends on a new helper function, unique_path(), that dies with an unimplemented error currently but will be implemented in a subsequent commit. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-04 10:40:45 -08:00
Elijah Newren	0ccfa4e5d8	merge-ort: handle D/F conflict where directory disappears due to merge When one side has a directory at a given path and the other side of history has a file at the path, but the merge resolves the directory away (e.g. because no path within that directory was modified and the other side deleted it, or because renaming moved all the files elsewhere), then we don't actually have a conflict anymore. We just need to clear away any information related to the relevant directory, and then the subsequent process_entry() handling can handle the given path. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-04 10:40:45 -08:00
Junio C Hamano	ffd27e6cb2	CoC: explicitly take any whitespace breakage We'll keep this document mostly in sync with the upstream; let's help "git am" and "git show" by telling them that they may introduce what we may consider whitespace errors. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-04 09:44:49 -08:00
Ævar Arnfjörð Bjarmason	cb50786f49	CoC: Update word-wrapping to match upstream When the CoC document was added in `5cdf2301d4` (add a Code of Conduct document, 2019-09-24) it was added from some 1.4 version of the document whose word wrapping doesn't match what's currently at [1], which matches content/version/1/4/code-of-conduct.md in the CoC repository[2]. Let's update our version to match that, to make reading subsequent diffs easier. There are no non-whitespace changes here. 1. https://www.contributor-covenant.org/version/1/4/code-of-conduct/ 2. https://github.com/ContributorCovenant/contributor_covenant Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-04 09:14:38 -08:00
Eric Wong	a9ecaa06a7	core.abbrev=no disables abbreviations This allows users to write hash-agnostic scripts and configs by disabling abbreviations. Using "-c core.abbrev=40" will be insufficient with SHA-256, and "-c core.abbrev=64" won't work with SHA-1 repos today. Signed-off-by: Eric Wong <e@80x24.org> [jc: tweaked implementation, added doc and a test] Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-23 13:40:09 -08:00
Ævar Arnfjörð Bjarmason	9ce0fc3311	mktag doc: grammar fix, when exists -> when it exists Amend the wording of documentation added in `6cfec03680` (mktag: minimally update the description., 2007-06-10). It makes more sense to say "when it exists" here, as we're referring to "the message". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-22 17:49:05 -08:00
Ævar Arnfjörð Bjarmason	f59b61dc4d	mktag doc: say <hash> not <sha1> Change the "mktag" documentation to refer to the input hash as just "hash", not "sha1". This command has supported SHA-256 for a while now. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-22 17:49:05 -08:00
Eric Sunshine	5bc12c11cc	t/perf: avoid unnecessary test_export() recursion test_export() has been self-recursive since its inception even though a simple for-loop would have served just as well to append its arguments to the `test_export_` variable separated by the pipe character "\|". Recently `test_export_` was changed instead to a space-separated list of tokens to be exported, an operation which can be accomplished via a single simple assignment, with no need for looping or recursion. Therefore, simplify the implementation. While at it, take advantage of the fact that variable names to be exported are shell identifiers, thus won't be composed of special characters or whitespace, thus simple a `$*` can be used rather than magical `"$@"`. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-22 13:45:36 -08:00
Sergey Organov	af04d8f1a5	t4013: add tests for --diff-merges=first-parent This new option provides essential new functionality, changing diff output to first parent only, without changing history traversal mode, so it deserves its own test. As we do it, add additional test that --diff-merges=first-parent by itself doesn't imply -p and only outputs diffs for merge commits. Signed-off-by: Sergey Organov <sorganov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-21 13:47:32 -08:00
Sergey Organov	1d24509b7b	doc/git-show: include --diff-merges description Move description of --diff-merges option from git-log.txt to diff-options.txt so that it is included in the git-show help. Signed-off-by: Sergey Organov <sorganov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-21 13:47:32 -08:00
Sergey Organov	e58142add4	doc/rev-list-options: document --first-parent changes merges format After introduction of the --diff-merges=first-parent, the --first-parent sets the default format for merges to the same value as this new option. Document this behavior and add corresponding reference to --diff-merges. Signed-off-by: Sergey Organov <sorganov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-21 13:47:32 -08:00
Sergey Organov	8efd2efc32	doc/diff-generate-patch: mention new --diff-merges option Mention --diff-merges instead of -m in a note to merge formats to aid discoverability, as -m is now described among --diff-merges options anyway. Signed-off-by: Sergey Organov <sorganov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-21 13:47:32 -08:00
Sergey Organov	b5ffa9ec10	doc/git-log: describe new --diff-merges options Describe all the new --diff-merges options in the git-log.txt and adopt description of originals accordingly. Signed-off-by: Sergey Organov <sorganov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-21 13:47:32 -08:00
Sergey Organov	388091fe4d	diff-merges: add '--diff-merges=1' as synonym for 'first-parent' As we now have --diff-merges={m\|c\|cc}, add --diff-merges=1 as synonym for --diff-merges=first-parent, to have shorter mnemonics for it as well. Signed-off-by: Sergey Organov <sorganov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-21 13:47:32 -08:00
Sergey Organov	5071c75316	diff-merges: add old mnemonic counterparts to --diff-merges This adds --diff-merges={m\|c\|cc} values that match mnemonics of old options, for those who are used to them. Note that, say, --diff-meres=cc behaves differently than --cc, as the latter implies -p and therefore enables diffs for all the commits, while the former enables output of diffs for merge commits only. Signed-off-by: Sergey Organov <sorganov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-21 13:47:32 -08:00
Sergey Organov	a6d19ecc6b	diff-merges: let new options enable diff without -p New options don't have any visible effect unless -p is either given or implied, as unlike -c/-cc we don't imply -p with --diff-merges. To fix this, this patch adds new functionality by letting new options enable output of diffs for merge commits only. Add 'merges_need_diff' field and set it whenever diff output for merges is enabled by any of the new options. Extend diff output logic accordingly, to output diffs for merges when 'merges_need_diff' is set even when no -p has been provided. Signed-off-by: Sergey Organov <sorganov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-21 13:47:32 -08:00
Sergey Organov	5733b20f41	diff-merges: do not imply -p for new options Add 'combined_imply_patch' field and set it only for old --cc/-c options, then imply -p if this flag is set instead of implying -p whenever 'combined_merge' flag is set. We don't want new --diff-merge options to imply -p, to make it possible to enable output of diffs for merges independently from non-merge commits. At the same time we want to preserve behavior of old --c/-c/-m options and their interactions with --first-parent, to stay backward-compatible. This patch is first step in this direction: it separates old "--cc/-c imply -p" logic from the rest of the options. Signed-off-by: Sergey Organov <sorganov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-21 13:47:32 -08:00
Sergey Organov	8c0ba528bc	diff-merges: implement new values for --diff-merges We first implement new options as exact synonyms for their original counterparts, to get all the infrastructure right, and keep functional improvements for later commits. The following values are implemented: --diff-merges= old equivalent first\|first-parent = --first-parent (only format implications) sep\|separate = -m comb\|combined = -c dense\| dense-combined = --cc Signed-off-by: Sergey Organov <sorganov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-21 13:47:32 -08:00
Sergey Organov	255a4dacc5	diff-merges: make -m/-c/--cc explicitly mutually exclusive -c/--cc got precedence over -m only because of external logic where corresponding flags are checked before that for -m. This is too error-prone, so add code that explicitly makes these 3 options mutually exclusive, so that the last option specified on the command-line gets precedence. Signed-off-by: Sergey Organov <sorganov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-21 13:47:31 -08:00
Sergey Organov	3d2b5f2f49	diff-merges: refactor opt settings into separate functions To prepare introduction of new options some of which will be synonyms to existing options, let every option handling code just call corresponding function. Signed-off-by: Sergey Organov <sorganov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-21 13:47:31 -08:00
Sergey Organov	a6e66af923	diff-merges: get rid of now empty diff_merges_init_revs() After getting rid of 'ignore_merges' field, the diff_merges_init_revs() function became empty. Get rid of it. Signed-off-by: Sergey Organov <sorganov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-21 13:47:31 -08:00
Sergey Organov	d9b1bc6d13	diff-merges: group diff-merge flags next to each other inside 'rev_info' The relevant flags were somewhat scattered over definition of 'struct rev_info'. Rearrange them to group them together. Signed-off-by: Sergey Organov <sorganov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-21 13:47:31 -08:00
Sergey Organov	1a2c4d8050	diff-merges: split 'ignore_merges' field 'ignore_merges' was 3-way field that served two distinct purposes that we now assign to 2 new independent flags: 'separate_merges', and 'explicit_diff_merges'. 'separate_merges' tells that we need to output diff format containing separate diff for every parent (as opposed to 'combine_merges'). 'explicit_diff_merges' tells that at least one of diff-merges options has been explicitly specified on the command line, so no defaults should apply. Signed-off-by: Sergey Organov <sorganov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-21 13:47:31 -08:00
Sergey Organov	6fc944d895	diff-merges: fix -m to properly override -c/--cc Logically, -m, -c, --cc specify 3 different formats for representing merge commits, yet -m doesn't in fact override -c or --cc, that makes no sense. Fix -m to properly override -c/--cc, and change the tests accordingly. Signed-off-by: Sergey Organov <sorganov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-21 13:47:31 -08:00
Sergey Organov	ec315c66bb	t4013: add tests for -m failing to override -c/--cc Logically, -m, -c, --cc specify 3 different formats for representing merge commits, yet -m doesn't in fact override -c or --cc, that makes no sense. Add 2 expected to fail tests that demonstrate the problem. Signed-off-by: Sergey Organov <sorganov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-21 13:47:31 -08:00
Sergey Organov	14c14b44e4	t4013: support test_expect_failure through ':failure' magic Add support to be able to specify expected failure, through :failure magic, like this: :failure cmd args Signed-off-by: Sergey Organov <sorganov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-21 13:47:31 -08:00
Sergey Organov	e121b4b822	diff-merges: revise revs->diff flag handling Do not set revs->diff when we encounter an option that needs it, as it'd be impossible to undo later. Besides, some other options than what we handle here set this flag as well, and we'd interfere with them trying to clear this flag later. Rather set revs->diff, if finally needed, in diff_merges_setup_revs(). As an additional bonus, this also makes our code shorter. Signed-off-by: Sergey Organov <sorganov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-21 13:47:31 -08:00
Sergey Organov	0c627f5d3c	diff-merges: handle imply -p on -c/--cc logic for log.c Move logic that handles implying -p on -c/--cc from log_setup_revisions_tweak() to diff_merges_setup_revs(), where it belongs. Signed-off-by: Sergey Organov <sorganov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-21 13:47:31 -08:00
Sergey Organov	3291eea310	diff-merges: introduce revs->first_parent_merges flag This new field allows us to separate format of diff for merges from 'first_parent_only' flag which primary purpose is limiting history traversal. This change further localizes diff format selection logic into the diff-merges.c file. Signed-off-by: Sergey Organov <sorganov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-21 13:47:31 -08:00
Sergey Organov	3b6c17b5c0	diff-merges: new function diff_merges_set_dense_combined_if_unset() Call it where given functionality is needed instead of direct checking/tweaking of diff merges related fields. Signed-off-by: Sergey Organov <sorganov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-21 13:47:31 -08:00
Sergey Organov	09322b1da9	diff-merges: new function diff_merges_suppress() This function sets all the relevant flags to disabled state, so that no code that checks only one of them get it wrong. Then we call this new function everywhere where diff merges output suppression is needed. Signed-off-by: Sergey Organov <sorganov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-21 13:47:31 -08:00
Sergey Organov	564a4fc847	diff-merges: re-arrange functions to match the order they are called in For clarity, define public functions in the order they are called, to make logic inter-dependencies easier to grok. Signed-off-by: Sergey Organov <sorganov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-21 13:47:31 -08:00
Sergey Organov	4f54544d73	diff-merges: rename diff_merges_default_to_enable() to match semantics Rename diff_merges_default_to_enable() to diff_merges_default_to_first_parent() to match its semantics. Signed-off-by: Sergey Organov <sorganov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-21 13:47:31 -08:00
Sergey Organov	7acf0d06f5	diff-merges: move checks for first_parent_only out of the module The checks for first_parent_only don't in fact belong to this module, as the primary purpose of this flag is history traversal limiting, so get it out of this module and rename the diff_merges_first_parent_defaults_to_enable() to diff_merges_default_to_enable() to match new semantics. Signed-off-by: Sergey Organov <sorganov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-21 13:47:30 -08:00
Sergey Organov	18f09473bf	diff-merges: rename all functions to have common prefix Use the same "diff_merges" prefix for all the diff merges function names. Signed-off-by: Sergey Organov <sorganov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-21 13:47:30 -08:00
Sergey Organov	a37eec6333	revision: move diff merges functions to its own diff-merges.c Create separate diff-merges.c and diff-merges.h files, and move all the code related to handling of diff merges there. Signed-off-by: Sergey Organov <sorganov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-21 13:47:30 -08:00
Sergey Organov	3d4fd94363	revision: provide implementation for diff merges tweaks Use these implementations from show_setup_revisions_tweak() and log_setup_revisions_tweak() in builtin/log.c. This completes moving of management of diff merges parameters to a single place, where we can finally observe them simultaneously. Signed-off-by: Sergey Organov <sorganov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-21 13:47:30 -08:00
Sergey Organov	027c4783d7	revision: factor out initialization of diff-merge related settings Move initialization code related to diffing merges into new init_diff_merge_revs() function. Signed-off-by: Sergey Organov <sorganov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-21 13:47:30 -08:00
Sergey Organov	299a663440	revision: factor out setup of diff-merge related settings Move all the setting code related to diffing merges into new setup_diff_merge_revs() function. Signed-off-by: Sergey Organov <sorganov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-21 13:47:30 -08:00
Sergey Organov	891e417cbc	revision: factor out parsing of diff-merge related options Move all the parsing code related to diffing merges into new parse_diff_merge_opts() function. Signed-off-by: Sergey Organov <sorganov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-21 13:47:30 -08:00
Eric Sunshine	cf76baea41	worktree: teach `repair` to fix multi-directional breakage `git worktree repair` knows how to repair the two-way links between the repository and a worktree as long as a link in one or the other direction is sound. For instance, if a linked worktree is moved (without using `git worktree move`), repair is possible because the worktree still knows the location of the repository even though the repository no longer knows where the worktree is. Similarly, if the repository is moved, repair is possible since the repository still knows the locations of the worktrees even though the worktrees no longer know where the repository is. However, if both the repository and the worktrees are moved, then links are severed in both directions, and no repair is possible. This is the case even when the new worktree locations are specified as arguments to `git worktree repair`. The reason for this limitation is twofold. First, when `repair` consults the worktree's gitfile (/path/to/worktree/.git) to determine the corresponding <repo>/worktrees/<id>/gitdir file to fix, <repo> is the old path to the repository, thus it is unable to fix the `gitdir` file at its new location since it doesn't know where it is. Second, when `repair` consults <repo>/worktrees/<id>/gitdir to find the location of the worktree's gitfile (/path/to/worktree/.git), the path recorded in `gitdir` is the old location of the worktree's gitfile, thus it is unable to repair the gitfile since it doesn't know where it is. Fix these shortcomings by teaching `repair` to attempt to infer the new location of the <repo>/worktrees/<id>/gitdir file when the location recorded in the worktree's gitfile has become stale but the file is otherwise well-formed. The inference is intentionally simple-minded. For each worktree path specified as an argument, `git worktree repair` manually reads the ".git" gitfile at that location and, if it is well-formed, extracts the <id>. It then searches for a corresponding <id> in <repo>/worktrees/ and, if found, concludes that there is a reasonable match and updates <repo>/worktrees/<id>/gitdir to point at the specified worktree path. In order for <repo> to be known, `git worktree repair` must be run in the main worktree or bare repository. `git worktree repair` first attempts to repair each incoming /path/to/worktree/.git gitfile to point at the repository, and then attempts to repair outgoing <repo>/worktrees/<id>/gitdir files to point at the worktrees. This sequence was chosen arbitrarily when originally implemented since the order of fixes is immaterial as long as one side of the two-way link between the repository and a worktree is sound. However, for this new repair technique to work, the order must be reversed. This is because the new inference mechanism, when it is successful, allows the outgoing <repo>/worktrees/<id>/gitdir file to be repaired, thus fixing one side of the two-way link. Once that side is fixed, the other side can be fixed by the existing repair mechanism, hence the order of repairs is now significant. Two safeguards are employed to avoid hijacking a worktree from a different repository if the user accidentally specifies a foreign worktree as an argument. The first, as described above, is that it requires an <id> match between the repository and the worktree. That itself is not foolproof for preventing hijack, so the second safeguard is that the inference will only kick in if the worktree's /path/to/worktree/.git gitfile does not point at a repository. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-21 13:44:28 -08:00
Elijah Newren	8119214f4e	merge-ort: implement merge_incore_recursive() Implement merge_incore_recursive(), mostly through the use of a new helper function, merge_ort_internal(), which itself is based off merge_recursive_internal() from merge-recursive.c. This drops the number of failures in the testsuite when run under GIT_TEST_MERGE_ALGORITHM=ort from around 1500 to 647. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-16 21:56:39 -08:00
Elijah Newren	43e9c4eecc	merge-ort: make clear_internal_opts() aware of partial clearing In order to handle recursive merges, after merging merge-bases we need to clear away most of the data we had built up but some of it needs to be kept -- in particular the "output" field. Rename the function to reflect its future change in use. Further, since "reinitialize" means we'll be reusing the fields immediately, take advantage of this to only partially clear maps, leaving the hashtable allocated and pre-sized. (This may be slightly out-of-order since the speedups aren't realized until there are far more strmaps in use, but the patch submission process already went out of order because of various questions and requests for strmap. Anyway, see commit `6ccdfc2a20` ("strmap: enable faster clearing and reusing of strmaps", 2020-11-05), for performance details about the use of strmap_partial_clear().) Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-16 21:56:39 -08:00
Elijah Newren	4296d8f17d	merge-ort: copy a few small helper functions from merge-recursive.c In a subsequent commit, we will implement the traditional recursiveness that gave merge-recursive its name, namely merging non-unique merge-bases to come up with a single virtual merge base. Copy a few helper functions from merge-recursive.c that we will use in the implementation. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-16 21:56:39 -08:00
Elijah Newren	b0ca120554	commit: move reverse_commit_list() from merge-recursive Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-16 21:56:39 -08:00
Felipe Contreras	c525de335e	pull: display default warning only when non-ff There's no need to display the annoying warning on every pull... only the ones that are not fast-forward. The current warning tests still pass, but not because of the arguments or the configuration, but because they are all fast-forward. We need to test non-fast-forward situations now. Suggestions-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-15 17:39:42 -08:00
Junio C Hamano	7539fdc629	pull: correct condition to trigger non-ff advice Refactor the advise() call that teaches users how they can choose between merge and rebase into a helper function. This revealed that the caller's logic needs to be further clarified to allow future actions (like "erroring out" instead of the current "go ahead and merge anyway") that should happen whether the advice message is squelched out. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-15 17:39:42 -08:00
Junio C Hamano	b044db9172	pull: get rid of unnecessary global variable It is easy enough to do, and gives a more descriptive name to the variable that is scoped in a more focused way. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-15 17:39:17 -08:00
Elijah Newren	6fcccbd755	merge-ort: add implementation of type-changed rename handling Implement cases where renames are involved in type changes (i.e. the side of history that didn't rename the file changed its type from a regular file to a symlink or submodule). There was some code to handle this in merge-recursive but only in the special case when the renamed file had no content changes. The code here works differently -- it knows process_entry() can handle mode conflicts, so it does a few minimal tweaks to ensure process_entry() can just finish the job as needed. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-15 17:18:32 -08:00
Elijah Newren	f1665e6918	merge-ort: add implementation of normal rename handling Implement handling of normal renames. This code replaces the following from merge-recurisve.c: * the code relevant to RENAME_NORMAL in process_renames() * the RENAME_NORMAL case of process_entry() Also, there is some shared code from merge-recursive.c for multiple different rename cases which we will no longer need for this case (or other rename cases): * handle_rename_normal() * setup_rename_conflict_info() The consolidation of four separate codepaths into one is made possible by a change in design: process_renames() tweaks the conflict_info entries within opt->priv->paths such that process_entry() can then handle all the non-rename conflict types (directory/file, modify/delete, etc.) orthogonally. This means we're much less likely to miss special implementation of some kind of combination of conflict types (see commits brought in by `66c62eaec6` ("Merge branch 'en/merge-tests'", 2020-11-18), especially commit `ef52778708` ("merge tests: expect improved directory/file conflict handling in ort", 2020-10-26) for more details). That, together with letting worktree/index updating be handled orthogonally in the merge_switch_to_result() function, dramatically simplifies the code for various special rename cases. (To be fair, the code for handling normal renames wasn't all that complicated beforehand, but it's still much simpler now.) Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-15 17:18:32 -08:00
Elijah Newren	35e47e3514	merge-ort: add implementation of rename collisions Implement rename/rename(2to1) and rename/add handling, i.e. a file is renamed into a location where another file is added (with that other file either being a plain add or itself coming from a rename). Note that rename collisions can also have a special case stacked on top: the file being renamed on one side of history is deleted on the other (yielding either a rename/add/delete conflict or perhaps a rename/rename(2to1)/delete[/delete]) conflict. One thing to note here is that when there is a double rename, the code in question only handles one of them at a time; a later iteration through the loop will handle the other. After they've both been handled, process_entry()'s normal add/add code can handle the collision. This code replaces the following from merge-recurisve.c: * all the 2to1 code in process_renames() * the RENAME_TWO_FILES_TO_ONE case of process_entry() * handle_rename_rename_2to1() * handle_rename_add() Also, there is some shared code from merge-recursive.c for multiple different rename cases which we will no longer need for this case (or other rename cases): * handle_file_collision() * setup_rename_conflict_info() The consolidation of six separate codepaths into one is made possible by a change in design: process_renames() tweaks the conflict_info entries within opt->priv->paths such that process_entry() can then handle all the non-rename conflict types (directory/file, modify/delete, etc.) orthogonally. This means we're much less likely to miss special implementation of some kind of combination of conflict types (see commits brought in by `66c62eaec6` ("Merge branch 'en/merge-tests'", 2020-11-18), especially commit `ef52778708` ("merge tests: expect improved directory/file conflict handling in ort", 2020-10-26) for more details). That, together with letting worktree/index updating be handled orthogonally in the merge_switch_to_result() function, dramatically simplifies the code for various special rename cases. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-15 17:18:32 -08:00
Elijah Newren	2e91ddd24e	merge-ort: add implementation of rename/delete conflicts Implement rename/delete conflicts, i.e. one side renames a file and the other deletes the file. This code replaces the following from merge-recurisve.c: * the code relevant to RENAME_DELETE in process_renames() * the RENAME_DELETE case of process_entry() * handle_rename_delete() Also, there is some shared code from merge-recursive.c for multiple different rename cases which we will no longer need for this case (or other rename cases): * handle_change_delete() * setup_rename_conflict_info() The consolidation of five separate codepaths into one is made possible by a change in design: process_renames() tweaks the conflict_info entries within opt->priv->paths such that process_entry() can then handle all the non-rename conflict types (directory/file, modify/delete, etc.) orthogonally. This means we're much less likely to miss special implementation of some kind of combination of conflict types (see commits brought in by `66c62eaec6` ("Merge branch 'en/merge-tests'", 2020-11-18), especially commit `ef52778708` ("merge tests: expect improved directory/file conflict handling in ort", 2020-10-26) for more details). That, together with letting worktree/index updating be handled orthogonally in the merge_switch_to_result() function, dramatically simplifies the code for various special rename cases. To be fair, there is a _slight_ tweak to process_entry() here, because rename/delete cases will also trigger the modify/delete codepath. However, we only want a modify/delete message to be printed for a rename/delete conflict if there is a content change in the renamed file in addition to the rename. So process_renames() and process_entry() aren't quite fully orthogonal, but they are pretty close. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-15 17:18:32 -08:00
Elijah Newren	53e88a0353	merge-ort: add implementation of both sides renaming differently Implement rename/rename(1to2) handling, i.e. both sides of history renaming a file and rename it differently. This code replaces the following from merge-recurisve.c: * all the 1to2 code in process_renames() * the RENAME_ONE_FILE_TO_TWO case of process_entry() * handle_rename_rename_1to2() Also, there is some shared code from merge-recursive.c for multiple different rename cases which we will no longer need for this case (or other rename cases): * handle_file_collision() * setup_rename_conflict_info() The consolidation of five separate codepaths into one is made possible by a change in design: process_renames() tweaks the conflict_info entries within opt->priv->paths such that process_entry() can then handle all the non-rename conflict types (directory/file, modify/delete, etc.) orthogonally. This means we're much less likely to miss special implementation of some kind of combination of conflict types (see commits brought in by `66c62eaec6` ("Merge branch 'en/merge-tests'", 2020-11-18), especially commit `ef52778708` ("merge tests: expect improved directory/file conflict handling in ort", 2020-10-26) for more details). That, together with letting worktree/index updating be handled orthogonally in the merge_switch_to_result() function, dramatically simplifies the code for various special rename cases. To be fair, there is a _slight_ tweak to process_entry() here to make sure that the two different paths aren't marked as clean but are left in a conflicted state. So process_renames() and process_entry() aren't quite entirely orthogonal, but they are pretty close. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-15 17:18:32 -08:00
Elijah Newren	af1e56c49e	merge-ort: add implementation of both sides renaming identically Implement rename/rename(1to1) handling, i.e. both sides of history renaming a file but renaming the same way. This code replaces the following from merge-recurisve.c: * all the 1to1 code in process_renames() * the RENAME_ONE_FILE_TO_ONE case of process_entry() Also, there is some shared code from merge-recursive.c for multiple different rename cases which we will no longer need for this case (or other rename cases): * handle_rename_normal() * setup_rename_conflict_info() The consolidation of four separate codepaths into one is made possible by a change in design: process_renames() tweaks the conflict_info entries within opt->priv->paths such that process_entry() can then handle all the non-rename conflict types (directory/file, modify/delete, etc.) orthogonally. This means we're much less likely to miss special implementation of some kind of combination of conflict types (see commits brought in by `66c62eaec6` ("Merge branch 'en/merge-tests'", 2020-11-18), especially commit `ef52778708` ("merge tests: expect improved directory/file conflict handling in ort", 2020-10-26) for more details). That, together with letting worktree/index updating be handled orthogonally in the merge_switch_to_result() function, dramatically simplifies the code for various special rename cases. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-15 17:18:32 -08:00
Junio C Hamano	c3b58472be	pack-redundant: gauge the usage before proposing its removal The subcommand is unusably slow and the reason why nobody reports it as a performance bug is suspected to be the absense of users. Let's show a big message that asks the user to tell us that they still care about the command when an attempt is made to run the command, with an escape hatch to override it with a command line option. In a few releases, we may turn it into an error and keep it for a few more releases before finally removing it (during the whole time, the plan to remove it would be interrupted by end user raising hand). Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-15 14:30:11 -08:00
Elijah Newren	9db2ac5616	diffcore-rename: accelerate rename_dst setup register_rename_src() simply references the passed pair inside rename_src. In contrast, add_rename_dst() did something entirely different for rename_dst. Instead of copying the passed pair, it made a copy of the second diff_filespec from the passed pair, referenced it, and then set the diff_rename_dst.pair field to NULL. Later, when a pairing is found, record_rename_pair() allocated a full diff_filepair via diff_queue() and pointed its src and dst fields at the appropriate diff_filespecs. This contrast between register_rename_src() for the rename_src data structure and add_rename_dst() for the rename_dst data structure is oddly inconsistent and requires more memory and work than necessary. Let's just reference the original diff_filepair in rename_dst as-is, just as we do with rename_src. Add a new rename_dst.is_rename field, since the rename_dst.p field is never NULL unlike the old rename_dst.pair field. Taking advantage of this change and the fact that same-named paths will be adjacent, we can get rid of the sorting of the array and most of the lookups on it, allowing us to instead just append as we go. However, there is one remaining reason to still keep locate_rename_dst(): handling broken pairs (i.e. when break detection is on). Those are somewhat rare, but we can set up a simple strintmap to get the map between the source and the index. Doing that allows us to still have a fast lookup without sorting the rename_dst array. Since the sorting had been done in a weakly quadratic manner, when many renames are involved this time could add up. There is still a strcmp() in add_rename_dst() that I have left in place to make it easier to verify that the algorithm has the same results. This strcmp() is there to check for duplicate destination entries (which was the easiest way at the time to avoid segfaults in the diffcore-rename code when trees had multiple entries at a given path). The underlying double free()s are no longer an issue with the new algorithm, but that can be addressed in a subsequent commit. This patch is being submitted in a different order than its original development, but in a large rebase of many commits with lots of renames and with several optimizations to inexact rename detection, both setup time and write back to output queue time from diffcore_rename() were sizeable chunks of overall runtime. This patch accelerated the setup time by about 65%, and final write back to the output queue time by about 50%, resulting in an overall drop of 3.5% on the execution time of rebasing a few dozen patches. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-14 09:34:50 -08:00
Elijah Newren	b970b4ef62	diffcore-rename: simplify and accelerate register_rename_src() register_rename_src() took pains to create an array in rename_src which was sorted by pathname of the contained diff_filepair. The sorting was entirely unnecessary since callers pass filepairs to us in sorted order. We can simply append to the end of the rename_src array, speeding up diffcore_rename() setup time. Also, note that I dropped the return type on the function since it was unconditionally discarded anyway. This patch is being submitted in a different order than its original development, but in a large rebase of many commits with lots of renames and with several optimizations to inexact rename detection, diffcore_rename() setup time was a sizeable chunk of overall runtime. This patch dropped execution time of rebasing 35 commits with lots of renames by 2% overall. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-14 09:34:50 -08:00
Elijah Newren	ac14de13b2	t4058: explore duplicate tree entry handling in a bit more detail While creating the last commit, I found a number of other cases where git would segfault when faced with trees that have duplicate entries. None of these segfaults are in the diffcore-rename code (they all occur in cache-tree and unpack-trees). Further, to my knowledge, no one has ever been adversely affected by these bugs, and given that it has been 15 years and folks have fixed a few other issues with historical duplicate entries (as noted in the last commit), I am not sure we will ever run into anyone having problems with these. So I am not sure these are worth fixing, but it doesn't hurt to at least document these failures in the same test file that is concerned with duplicate tree entries. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-14 09:34:50 -08:00
Elijah Newren	5c72261c66	t4058: add more tests and documentation for duplicate tree entry handling Commit `4d6be03b95` ("diffcore-rename: avoid processing duplicate destinations", 2015-02-26) added t4058 to demonstrate that a workaround it added to avoid double frees (namely to just turn off rename detection when trees had duplicate entries) would indeed avoid segfaults. The tests, though, give the impression that the expected diffs are "correct" when in reality they are just "don't segfault, and do something semi-reasonable under the circumstances". Add some notes to make this clearer. Also, commit `25d5ea410f` ("[PATCH] Redo rename/copy detection logic.", 2005-05-24) added a similar workaround to avoid segfaults, but for rename_src rather than rename_dst. I do not see any tests in the testsuite to cover the collision detection of entries limited to the source side, so add a couple. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-14 09:34:50 -08:00
Elijah Newren	81c4bf0296	diffcore-rename: reduce jumpiness in progress counters Inexact rename detection works by comparing all sources to all destinations, computing similarities, and then finding the best matches among those that are sufficiently similar. However, it is preceded by exact rename detection that works by checking if there are files with identical hashes. If exact renames are found, we can exclude some files from inexact rename detection. The inexact rename detection loops over the full set of files, but immediately skips those for which rename_dst[i].is_rename is true and thus doesn't compare any sources to that destination. As such, these paths shouldn't be included in the progress counter. For the eagle eyed, this change hints at an actual optimization -- the first one I presented at Git Merge 2020. I'll be submitting that optimization later, once the basic merge-ort algorithm has merged. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-14 09:34:50 -08:00
Elijah Newren	ad8a1be529	diffcore-rename: simplify limit check diffcore-rename had two different checks of the form if ((a < limit \|\| b < limit) && a * b <= limit * limit) This can be simplified to if (st_mult(a, b) <= st_mult(limit, limit)) which makes it clearer how we are checking for overflow, and makes it much easier to parse given the drop from 8 to 4 variable appearances. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-14 09:34:50 -08:00
Elijah Newren	00b8cccdd8	diffcore-rename: avoid usage of global in too_many_rename_candidates() too_many_rename_candidates() got the number of rename destinations via an argument to the function, but the number of rename sources via a global variable. That felt rather inconsistent. Pass in the number of rename sources as an argument as well. While we are at it... We had a local variable, num_src, that served two purposes. Initially it was set to the global value, but later was used for counting a subset of the number of sources. Since we now have a function argument for the former usage, introduce a clearer variable name for the latter usage. This patch has no behavioral changes; it's just renaming and passing an argument instead of grabbing it from the global namespace. (You may find it easier to view the patch using git diff's --color-words option.) Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-14 09:34:50 -08:00
Elijah Newren	26a66a6b1c	diffcore-rename: rename num_create to num_destinations Our main data structures are rename_src and rename_dst. For counters of these data structures, num_sources and num_destinations seem natural; definitely more so than using num_create for the latter. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-14 09:34:50 -08:00
Felipe Contreras	278f4be806	pull: give the advice for choosing rebase/merge much later Eventually we want to be omit the advice when we can fast-forward in which case there is no reason to require the user to choose between rebase or merge. In order to do so, we need to delay giving the advice up to the point where we can check if we can fast-forward or not. Additionally, config_get_rebase() was probably never its true home. Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-14 09:03:17 -08:00
Felipe Contreras	77a7ec6329	pull: refactor fast-forward check We would like to be able to make this check before the decision to rebase is made in a future step. Besides, using a separate helper makes the code easier to follow. Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-14 08:59:40 -08:00
Elijah Newren	c2d267df02	merge-ort: add basic outline for process_renames() Add code which determines which kind of special rename case each rename corresponds to, but leave the handling of each type unimplemented for now. Future commits will implement each one. There is some tenuous resemblance to merge-recursive's process_renames(), but comparing the two is very unlikely to yield any insights. merge-ort's process_renames() is a bit complex and I would prefer if I could simplify it more, but it is far easier to grok than merge-recursive's function of the same name in my opinion. Plus, merge-ort handles more rename conflict types than merge-recursive does. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-14 08:45:59 -08:00
Elijah Newren	965a7bc21c	merge-ort: implement compare_pairs() and collect_renames() Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-14 08:45:59 -08:00
Elijah Newren	f39d05ca26	merge-ort: implement detect_regular_renames() Based heavily on merge-recursive's get_diffpairs() function, and also includes the necessary paired call to diff_warn_rename_limit() so that users will be warned if merge.renameLimit is not sufficiently large for rename detection to run. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-14 08:45:59 -08:00
Elijah Newren	e1a124e8dc	merge-ort: add initial outline for basic rename detection Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-14 08:45:58 -08:00
Elijah Newren	864075ec43	merge-ort: add basic data structures for handling renames This will grow later, but we only need a few fields for basic rename handling. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-14 08:45:58 -08:00
Elijah Newren	c5a6f65527	merge-ort: add modify/delete handling and delayed output processing The focus here is on adding a path_msg() which will queue up warning/conflict/notice messages about the merge for later processing, storing these in a pathname -> strbuf map. It might seem like a big change, but it really just is: * declaration of necessary map with some comments * initialization and recording of data * a bunch of code to iterate over the map at print/free time * at least one caller in order to avoid an error about having an unused function (which we provide in the form of implementing modify/delete conflict handling). At this stage, it is probably not clear why I am opting for delayed output processing. There are multiple reasons: 1. Merges are supposed to abort if they would overwrite dirty changes in the working tree. We cannot correctly determine whether changes would be overwritten until both rename detection has occurred and full processing of entries with the renames has finalized. Warning/conflict/notice messages come up at intermediate codepaths along the way, so unless we want spurious conflict/warning messages being printed when the merge will be aborted anyway, we need to save these messages and only print them when relevant. 2. There can be multiple messages for a single path, and we want all messages for a give path to appear together instead of having them grouped by conflict/warning type. This was a problem already with merge-recursive.c but became even more important due to the splitting apart of conflict types as discussed in the commit message for `1f3c9ba707` ("t6425: be more flexible with rename/delete conflict messages", 2020-08-10) 3. Some callers might want to avoid showing the output in certain cases, such as if the end result is a clean merge. Rebases have typically done this. 4. Some callers might not want the output to go to stdout or even stderr, but might want to do something else with it entirely. For example, a --remerge-diff option to `git show` or `git log -p` that remerges on the fly and diffs merge commits against the remerged version would benefit from stdout/stderr not being written to in the standard form. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-13 14:38:47 -08:00
Elijah Newren	e2e9dc030c	merge-ort: add die-not-implemented stub handle_content_merge() function This simplistic and weird-looking patch is here to facilitate future patch submissions. Adding this stub allows rename detection code to reference it in one patch series, while a separate patch series can define the implementation, and then both series can merge cleanly and work nicely together at that point. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-13 14:38:47 -08:00
Elijah Newren	04af1879b9	merge-ort: add function grouping comments Commit b658536f59 ("merge-ort: add some high-level algorithm structure", 2020-10-27) added high-level structure of the ort merge algorithm. As we have added more and more functions, that high-level structure has been slightly obscured. Since functions are still grouped according to this high-level structure, add comments denoting sections where all the functions are specifically tied to a piece of the high-level structure. This function groupings include a few sub-divisions of the original high-level structure, including some sub-divisions that are yet to be submitted. Each has (or will have) several functions all serving as helpers to one or two main functions for each section. As an added bonus, the comments will serve to provide a small textual separation between nearby sections and allow the next three patch series to be submitted independently and merge cleanly. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-13 14:38:47 -08:00
Elijah Newren	43c1dccb91	merge-ort: add a paths_to_free field to merge_options_internal This field will be used in future patches to allow removal of paths from opt->priv->paths. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-13 14:38:47 -08:00
Elijah Newren	1c7873cdf4	merge-ort: add a path_conflict field to merge_options_internal This field is not yet used, but will be used by both the rename handling code, and the conflict type handling code in process_entry(). Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-13 14:38:40 -08:00
Elijah Newren	101bc5bc2d	merge-ort: add a clear_internal_opts helper Move most of merge_finalize() into a new helper function, clear_internal_opts(). This is a step to facilitate recursive merges, as well as some future optimizations. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-13 14:21:03 -08:00
Elijah Newren	67845745c1	merge-ort: add a few includes Include blob.h for definition of blob_type, and commit-reach.h for declarations of get_merge_bases() and in_merge_bases(). While none of these are used yet, we want to avoid cross-dependencies in the next three series of patches for merge-ort and merge them at the end; adding these "#include"s now avoids textual conflicts. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-13 14:21:03 -08:00
Elijah Newren	89422d29b1	merge-ort: free data structures in merge_finalize() Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-13 14:18:20 -08:00
Elijah Newren	ef2b369387	merge-ort: add implementation of record_conflicted_index_entries() After checkout(), the working tree has the appropriate contents, and the index matches the working copy. That means that all unmodified and cleanly merged files have correct index entries, but conflicted entries need to be updated. We do this by looping over the conflicted entries, marking the existing index entry for the path with CE_REMOVE, adding new higher order staged for the path at the end of the index (ignoring normal index sort order), and then at the end of the loop removing the CE_REMOVED-marked cache entries and sorting the index. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-13 14:18:20 -08:00
Elijah Newren	70912f66de	tree: enable cmp_cache_name_compare() to be used elsewhere Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-13 14:18:20 -08:00
Elijah Newren	6681ce5cf6	merge-ort: add implementation of checkout() Since merge-ort creates a tree for its output, when there are no conflicts, updating the working tree and index is as simple as using the unpack_trees() machinery with a twoway_merge (i.e. doing the equivalent of a "checkout" operation). If there were conflicts in the merge, then since the tree we created included all the conflict markers, then using the unpack_trees machinery in this manner will still update the working tree correctly. Further, all index entries corresponding to cleanly merged files will also be updated correctly by this procedure. Index entries corresponding to conflicted entries will appear as though the user had run "git add -u" after the merge to accept all files as-is with conflict markers. Thus, after running unpack_trees(), there needs to be a separate step for updating the entries in the index corresponding to conflicted files. This will be the job for the function record_conflicted_index_entris(), which will be implemented in a subsequent commit. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-13 14:18:20 -08:00
Elijah Newren	9fefce68dc	merge-ort: basic outline for merge_switch_to_result() This adds a basic implementation for merge_switch_to_result(), though just in terms of a few new empty functions that will be defined in subsequent commits. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-13 14:18:20 -08:00
Elijah Newren	bb470f4e13	merge-ort: step 3 of tree writing -- handling subdirectories as we go Our order for processing of entries means that if we have a tree of files that looks like Makefile src/moduleA/foo.c src/moduleA/bar.c src/moduleB/baz.c src/moduleB/umm.c tokens.txt Then we will process paths in the order of the leftmost column below. I have added two additional columns that help explain the algorithm that follows; the 2nd column is there to remind us we have oid & mode info we are tracking for each of these paths (which differs between the paths which I'm not representing well here), and the third column annotates the parent directory of the entry: tokens.txt <version_info> "" src/moduleB/umm.c <version_info> src/moduleB src/moduleB/baz.c <version_info> src/moduleB src/moduleB <version_info> src src/moduleA/foo.c <version_info> src/moduleA src/moduleA/bar.c <version_info> src/moduleA src/moduleA <version_info> src src <version_info> "" Makefile <version_info> "" When the parent directory changes, if it's a subdirectory of the previous parent directory (e.g. "" -> src/moduleB) then we can just keep appending. If the parent directory differs from the previous parent directory and is not a subdirectory, then we should process that directory. So, for example, when we get to this point: tokens.txt <version_info> "" src/moduleB/umm.c <version_info> src/moduleB src/moduleB/baz.c <version_info> src/moduleB and note that the next entry (src/moduleB) has a different parent than the last one that isn't a subdirectory, we should write out a tree for it 100644 blob <HASH> umm.c 100644 blob <HASH> baz.c then pop all the entries under that directory while recording the new hash for that directory, leaving us with tokens.txt <version_info> "" src/moduleB <new version_info> src This process repeats until at the end we get to tokens.txt <version_info> "" src <new version_info> "" Makefile <version_info> "" and then we can write out the toplevel tree. Since we potentially have entries in our string_list corresponding to multiple different toplevel directories, e.g. a slightly different repository might have: whizbang.txt <version_info> "" tokens.txt <version_info> "" src/moduleD <new version_info> src src/moduleC <new version_info> src src/moduleB <new version_info> src src/moduleA/foo.c <version_info> src/moduleA src/moduleA/bar.c <version_info> src/moduleA When src/moduleA is popped off, we need to know that the "last directory" reverts back to src, and how many entries in our string_list are associated with that parent directory. So I use an auxiliary offsets string_list which would have (parent_directory,offset) information of the form "" 0 src 2 src/moduleA 5 Whenever I write out a tree for a subdirectory, I set versions.nr to the final offset value and then decrement offsets.nr...and then add an entry to versions with a hash for the new directory. The idea is relatively simple, there's just a lot of accounting to implement this. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-13 14:18:20 -08:00
Elijah Newren	ee4012dcf9	merge-ort: step 2 of tree writing -- function to create tree object Create a new function, write_tree(), which will take a list of basenames, modes, and oids for a single directory and create a tree object in the object-store. We do not yet have just basenames, modes, and oids for just a single directory (we have a mixture of entries from all directory levels in the hierarchy) so we still die() before the current call to write_tree(), but the next patch will rectify that. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-13 14:18:20 -08:00
Elijah Newren	a9945bba60	merge-ort: step 1 of tree writing -- record basenames, modes, and oids As a step towards transforming the processed path->conflict_info entries into an actual tree object, start recording basenames, modes, and oids in a dir_metadata structure. Subsequent commits will make use of this to actually write a tree. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-13 14:18:20 -08:00
Elijah Newren	8adffaa818	merge-ort: have process_entries operate in a defined order We want to handle paths below a directory before needing to handle the directory itself. Also, we want to handle the directory immediately after the paths below it, so we can't use simple lexicographic ordering from strcmp (which would insert foo.txt between foo and foo/file.c). Copy string_list_df_name_compare() from merge-recursive.c, and set up a string list of paths sorted by that function so that we can iterate in the desired order. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-13 14:18:20 -08:00
Elijah Newren	6a02dd90c9	merge-ort: add a preliminary simple process_entries() implementation Add a process_entries() implementation that just loops over the paths and processes each one individually with an auxiliary process_entry() call. Add a basic process_entry() as well, which handles several cases but leaves a few of the more involved ones with die-not-implemented messages. Also, although process_entries() is supposed to create a tree, it does not yet have code to do so -- except in the special case of merging completely empty trees. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-13 14:18:20 -08:00
Elijah Newren	291f29caf6	merge-ort: avoid recursing into identical trees When all three trees have the same oid, there is no need to recurse into these trees to find that all files within them happen to match. We can just record any one of the trees as the resolution of merging that particular path. Immediately resolving trees for other types of trivial tree merges (such as one side matches the merge base, or the two sides match each other) would prevent us from detecting renames for some paths, and thus prevent us from doing three-way content merges for those paths whose renames we did not detect. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-13 14:18:20 -08:00
Elijah Newren	98bf984167	merge-ort: record stage and auxiliary info for every path Create a helper function, setup_path_info(), which can be used to record all the information we want in a merged_info or conflict_info. While there is currently only one caller of this new function, and some of its particular parameters are fixed, future callers of this function will be added later. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-13 14:18:20 -08:00
Elijah Newren	34e557af54	merge-ort: compute a few more useful fields for collect_merge_info Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-13 14:18:20 -08:00
Elijah Newren	885f0063e9	merge-ort: avoid repeating fill_tree_descriptor() on the same tree Three-way merges, by their nature, are going to often have two or more trees match at a given subdirectory. We can avoid calling fill_tree_descriptor() on the same tree by checking when these trees match. Noting when various oids match will also be useful in other calculations and optimizations as well. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-13 14:18:20 -08:00
Elijah Newren	d2bc1994f3	merge-ort: implement a very basic collect_merge_info() This does not actually collect any necessary info other than the pathnames involved, since it just allocates an all-zero conflict_info and stuffs that into paths. However, it invokes the traverse_trees() machinery to walk over all the paths and sets up the basic infrastructure we need. I have left out a few obvious optimizations to try to make this patch as short and obvious as possible. A subsequent patch will add some of those back in with some more useful data fields before we introduce a patch that actually sets up the conflict_info fields. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-13 14:18:19 -08:00
Elijah Newren	0c0d705b5c	merge-ort: add an err() function similar to one from merge-recursive Various places in merge-recursive used an err() function when it hit some kind of unrecoverable error. That code was from the reusable bits of merge-recursive.c that we liked, such as merge_3way, writing object files to the object store, reading blobs from the object store, etc. So create a similar function to allow us to port that code over, and use it for when we detect problems returned from collect_merge_info()'s traverse_trees() call, which we will be adding next. While we are at it, also add more documentation for the "clean" field from struct merge_result, particularly since the name suggests a boolean but it is not quite one and this is our first non-boolean usage. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-13 14:18:19 -08:00
Elijah Newren	c8017176ac	merge-ort: use histogram diff In my cursory investigation, histogram diffs are about 2% slower than Myers diffs. Others have probably done more detailed benchmarks. But, in short, histogram diffs have been around for years and in a number of cases provide obviously better looking diffs where Myers diffs are unintelligible but the performance hit has kept them from becoming the default. However, there are real merge bugs we know about that have triggered on git.git and linux.git, which I don't have a clue how to address without the additional information that I believe is provided by histogram diffs. See the following: https://lore.kernel.org/git/20190816184051.GB13894@sigill.intra.peff.net/ https://lore.kernel.org/git/CABPp-BHvJHpSJT7sdFwfNcPn_sOXwJi3=o14qjZS3M8Rzcxe2A@mail.gmail.com/ https://lore.kernel.org/git/CABPp-BGtez4qjbtFT1hQoREfcJPmk9MzjhY5eEq1QhXT23tFOw@mail.gmail.com/ I don't like mismerges. I really don't like silent mismerges. While I am sometimes willing to make performance and correctness tradeoff, I'm much more interested in correctness in general. I want to fix the above bugs. I have not yet started doing so, but I believe histogram diff at least gives me an angle. Unfortunately, I can't rely on using the information from histogram diff unless it's in use. And it hasn't been used because of a few percentage performance hit. In testcases I have looked at, merge-ort is _much_ faster than merge-recursive for non-trivial merges/rebases/cherry-picks. As such, this is a golden opportunity to switch out the underlying diff algorithm (at least the one used by the merge machinery; git-diff and git-log are separate questions); doing so will allow me to get additional data and improved diffs, and I believe it will help me fix the above bugs at some point in the future. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-13 14:18:19 -08:00
Elijah Newren	e4171b1b6d	merge-ort: port merge_start() from merge-recursive merge_start() basically does a bunch of sanity checks, then allocates and initializes opt->priv -- a struct merge_options_internal. Most of the sanity checks are usable as-is. The allocation/intialization is a bit different since merge-ort has a very different merge_options_internal than merge-recursive, but the idea is the same. The weirdest part here is that merge-ort and merge-recursive use the same struct merge_options, even though merge_options has a number of fields that are oddly specific to merge-recursive's internal implementation and don't even make sense with merge-ort's high-level design (e.g. buffer_output, which merge-ort has to always do). I reused the same data structure because: * most the fields made sense to both merge algorithms * making a new struct would have required making new enums or somehow externalizing them, and that was getting messy. * it simplifies converting the existing callers by not having to have different code paths for merge_options setup. I also marked detect_renames as ignored. We can revisit that later, but in short: merge-recursive allowed turning off rename detection because it was sometimes glacially slow. When you speed something up by a few orders of magnitude, it's worth revisiting whether that justification is still relevant. Besides, if folks find it's still too slow, perhaps they have a better scaling case than I could find and maybe it turns up some more optimizations we can add. If it still is needed as an option, it is easy to add later. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-13 14:18:19 -08:00
Elijah Newren	231e2dd49d	merge-ort: add some high-level algorithm structure merge_ort_nonrecursive_internal() will be used by both merge_inmemory_nonrecursive() and merge_inmemory_recursive(); let's focus on it for now. It involves some setup -- merge_start() -- followed by the following chain of functions: collect_merge_info() This function will populate merge_options_internal's paths field, via a call to traverse_trees() and a new callback that will be added later. detect_and_process_renames() This function will detect renames, and then adjust entries in paths to move conflict stages from old pathnames into those for new pathnames, so that the next step doesn't have to think about renames and just can do three-way content merging and such. process_entries() This function determines how to take the various stages (versions of a file from the three different sides) and merge them, and whether to mark the result as conflicted or cleanly merged. It also writes out these merged file versions as it goes to create a tree. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-13 14:18:19 -08:00
Elijah Newren	5b59c3db05	merge-ort: setup basic internal data structures Set up some basic internal data structures. The only carry-over from merge-recursive.c is call_depth, though needed_rename_limit will be added later. The central piece of data will definitely be the strmap "paths", which will map every relevant pathname under consideration to either a merged_info or a conflict_info. ("conflicted" is a strmap that is a subset of "paths".) merged_info contains all relevant information for a non-conflicted entry. conflict_info contains a merged_info, plus any additional information about a conflict such as the higher orders stages involved and the names of the paths those came from (handy once renames get involved). If an entry remains conflicted, the merged_info portion of a conflict_info will later be filled with whatever version of the file should be placed in the working directory (e.g. an as-merged-as-possible variation that contains conflict markers). Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-13 14:18:19 -08:00
brian m. carlson	fac60b8925	rev-parse: add option for absolute or relative path formatting git rev-parse has several options which print various paths. Some of these paths are printed relative to the current working directory, and some are absolute. Normally, this is not a problem, but there are times when one wants paths entirely in one format or another. This can be done trivially if the paths are canonical, but canonicalizing paths is not possible on some shell scripting environments which lack realpath(1) and also in Go, which lacks functions that properly canonicalize paths on Windows. To help out the scripter, let's provide an option which turns most of the paths printed by git rev-parse to be either relative to the current working directory or absolute and canonical. Document which options are affected and which are not so that users are not confused. This approach is cleaner and tidier than providing duplicates of existing options which are either relative or absolute. Note that if the user needs both forms, it is possible to pass an additional option in the middle of the command line which changes the behavior of subsequent operations. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-12 23:35:51 -08:00
brian m. carlson	be6e0daee7	abspath: add a function to resolve paths with missing components Currently, we have a function to resolve paths, strbuf_realpath. This function canonicalizes paths like realpath(3), but permits a trailing component to be absent from the file system. In other words, this is the behavior of the GNU realpath(1) without any arguments. In the future, we'll need this same behavior, except that we want to allow for any number of missing trailing components, which is the behavior of GNU realpath(1) with the -m option. This is useful because we'll want to canonicalize a path that may point to a not yet present path under the .git directory. For example, a user may want to know where an arbitrary ref would be stored if it existed in the file system. Let's refactor strbuf_realpath to move most of the code to an internal function and then pass it two flags to control its behavior. We'll add a strbuf_realpath_forgiving function that has our new behavior, and leave strbuf_realpath with the older, stricter behavior. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-12 23:35:47 -08:00
Ævar Arnfjörð Bjarmason	2762e17117	pretty format %(trailers) doc: avoid repetition Change the documentation for the various %(trailers) options so it isn't repeating part of the documentation for "only" about how boolean values are handled. Instead, let's split the description of that into general documentation at the top. It then suffices to refer to it by listing the options as "opt[=<BOOL>]". I'm also changing it to upper-case "[=<BOOL>]" from "[=val]" for consistency with "<SEP>" It took me a couple of readings to realize that these options were referring back to the "only" option's treatment of boolean values. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-09 14:16:42 -08:00
Ævar Arnfjörð Bjarmason	058761f1c1	pretty format %(trailers): add a "key_value_separator" Add a "key_value_separator" option to the "%(trailers)" pretty format, to go along with the existing "separator" argument. In combination these two options make it trivial to produce machine-readable (e.g. \0 and \0\0-delimited) format output. As elaborated on in a previous commit which added "keyonly" it was needlessly tedious to extract structured data from "%(trailers)" before the addition of this "key_value_separator" option. As seen by the test being added here extracting this data now becomes trivial. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-09 14:16:42 -08:00
Ævar Arnfjörð Bjarmason	9d87d5ae02	pretty format %(trailers): add a "keyonly" Add support for a "keyonly". This allows for easier parsing out of the key and value. Before if you didn't want to make assumptions about how the key was formatted. You'd need to parse it out as e.g.: --pretty=format:'%H%x00%(trailers:separator=%x00%x00)' \ '%x00%(trailers:separator=%x00%x00,valueonly)' And then proceed to deduce keys by looking at those two and subtracting the value plus the hardcoded ": " separator from the non-valueonly %(trailers) line. Now it's possible to simply do: --pretty=format:'%H%x00%(trailers:separator=%x00%x00,keyonly)' \ '%x00%(trailers:separator=%x00%x00,valueonly)' Which at least reduces it to a state machine where you get N keys and correlate them with N values. Even better would be to have a way to change the ": " delimiter to something easily machine-readable (a key might contain ": " too). A follow-up change will add support for that. I don't really have a use-case for just "keyonly" myself. I suppose it would be useful in some cases as "key=*" matches case-insensitively, so a plain "keyonly" will give you the variants of the keys you matched. I'm mainly adding it to fix the inconsistency with "valueonly". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-09 14:16:42 -08:00
Ævar Arnfjörð Bjarmason	8b966a0506	pretty-format %(trailers): fix broken standalone "valueonly" Fix %(trailers:valueonly) being a noop due to on overly eager optimization in format_trailer_info() which skips custom formatting if no custom options are given. When "valueonly" was added in `d9b936db52` (pretty: add support for "valueonly" option in %(trailers), 2019-01-28) we forgot to add it to the list of options that optimization checks for. See e.g. the addition of "key" in `250bea0c16` (pretty: allow showing specific trailers, 2019-01-28) for a similar change where this wasn't missed. Thus the "valueonly" option in "%(trailers:valueonly)" was a noop and the output was equivalent to that of a plain "%(trailers)". This wasn't caught because the tests for it always combined it with other options. Fix the bug by adding !opts->value_only to the list. I initially attempted to make this more future-proof by setting a flag if we got to ":" in "%(trailers:" in format_commit_one() in pretty.c. However, "%(trailers:" is also parsed in trailers_atom_parser() in ref-filter.c. There is an outstanding patch[1] unify those two, and such a fix, or other future-proofing, such as changing "process_trailer_options" flags into a bitfield, would conflict with that effort. Let's instead do the bare minimum here as this aspect of trailers is being actively worked on by another series. Let's also test for a plain "valueonly" without any other options, as well as "separator". All the other existing options on the pretty.c path had tests where they were the only option provided. I'm also keeping a sanity test for "%(trailers:)" being the same as "%(trailers)". There's no reason to suspect it wouldn't be in the current implementation, but let's keep it in the interest of black box testing. 1. https://lore.kernel.org/git/pull.726.git.1599335291.gitgitgadget@gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-09 14:16:42 -08:00
Derrick Stolee	f077b0a986	pack-bitmap-write: better reuse bitmaps If the old bitmap file contains a bitmap for a given commit, then that commit does not need help from intermediate commits in its history to compute its final bitmap. Eject that commit from the walk and insert it into a separate list of reusable commits that are eventually stored in the list of commits for computing bitmaps. This helps the repeat bitmap computation task, even if the selected commits shift drastically. This helps when a previously-bitmapped commit exists in the first-parent history of a newly-selected commit. Since we stop the walk at these commits and we use a first-parent walk, it is harder to walk "around" these bitmapped commits. It's not impossible, but we can greatly reduce the computation time for many selected commits. \| runtime (sec) \| peak heap (GB) \| \| \| \| \| from \| with \| from \| with \| \| scratch \| existing \| scratch \| existing \| -----------+---------+----------+---------+----------- last patch \| 88.478 \| 53.218 \| 2.157 \| 2.224 \| this patch \| 86.681 \| 16.164 \| 2.157 \| 2.222 \| Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-08 14:49:07 -08:00
Derrick Stolee	45f4eeb291	pack-bitmap-write: relax unique revwalk condition The previous commits improved the bitmap computation process for very long, linear histories with many refs by removing quadratic growth in how many objects were walked. The strategy of computing "intermediate commits" using bitmasks for which refs can reach those commits partitioned the poset of reachable objects so each part could be walked exactly once. This was effective for linear histories. However, there was a (significant) drawback: wide histories with many refs had an explosion of memory costs to compute the commit bitmasks during the exploration that discovers these intermediate commits. Since these wide histories are unlikely to repeat walking objects, the benefit of walking objects multiple times was not expensive before. But now, the commit walk before computing bitmaps is incredibly expensive. In an effort to discover a happy medium, this change reduces the walk for intermediate commits to only the first-parent history. This focuses the walk on how the histories converge, which still has significant reduction in repeat object walks. It is still possible to create quadratic behavior in this version, but it is probably less likely in realistic data shapes. Here is some data taken on a fresh clone of the kernel: \| runtime (sec) \| peak heap (GB) \| \| \| \| \| from \| with \| from \| with \| \| scratch \| existing \| scratch \| existing \| -----------+---------+----------+---------+----------- original \| 64.044 \| 83.241 \| 2.088 \| 2.194 \| last patch \| 45.049 \| 37.624 \| 2.267 \| 2.334 \| this patch \| 88.478 \| 53.218 \| 2.157 \| 2.224 \| Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-08 14:49:07 -08:00
Derrick Stolee	341fa34887	pack-bitmap-write: use existing bitmaps When constructing new bitmaps, we perform a commit and tree walk in fill_bitmap_commit() and fill_bitmap_tree(). This walk would benefit from using existing bitmaps when available. We must track the existing bitmaps and translate them into the new object order, but this is generally faster than parsing trees. In fill_bitmap_commit(), we must reorder thing somewhat. The priority queue walks commits from newest-to-oldest, which means we correctly stop walking when reaching a commit with a bitmap. However, if we walk trees interleaved with the commits, then we might be parsing trees that are actually part of a re-used bitmap. To avoid over-walking trees, add them to a LIFO queue and walk them after exploring commits completely. On git.git, this reduces a second immediate bitmap computation from 2.0s to 1.0s. On linux.git, we go from 32s to 22s. On chromium's fork network, we go from 227s to 198s. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-08 14:49:06 -08:00
Taylor Blau	83578051a9	pack-bitmap: factor out 'add_commit_to_bitmap()' 'find_objects()' currently needs to interact with the bitmaps khash pretty closely. To make 'find_objects()' read a little more straightforwardly, remove some of the khash-level details into a new function that describes what it does: 'add_commit_to_bitmap()'. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-08 14:49:06 -08:00
Taylor Blau	98c31f366a	pack-bitmap: factor out 'bitmap_for_commit()' A couple of callers within pack-bitmap.c duplicate logic to lookup a given object id in the bitamps khash. Factor this out into a new function, 'bitmap_for_commit()' to reduce some code duplication. Make this new function non-static, since it will be used in later commits from outside of pack-bitmap.c. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-08 14:49:04 -08:00
Jeff King	449fa5ee06	pack-bitmap-write: ignore BITMAP_FLAG_REUSE The on-disk bitmap format has a flag to mark a bitmap to be "reused". This is a rather curious feature, and works like this: - a run of pack-objects would decide to mark the last 80% of the bitmaps it generates with the reuse flag - the next time we generate bitmaps, we'd see those reuse flags from the last run, and mark those commits as special: - we'd be more likely to select those commits to get bitmaps in the new output - when generating the bitmap for a selected commit, we'd reuse the old bitmap as-is (rearranging the bits to match the new pack, of course) However, neither of these behaviors particularly makes sense. Just because a commit happened to be bitmapped last time does not make it a good candidate for having a bitmap this time. In particular, we may choose bitmaps based on how recent they are in history, or whether a ref tip points to them, and those things will change. We're better off re-considering fresh which commits are good candidates. Reusing the existing bitmap _is_ a reasonable thing to do to save computation. But only reusing exact bitmaps is a weak form of this. If we have an old bitmap for A and now want a new bitmap for its child, we should be able to compute that only by looking at trees and that are new to the child. But this code would consider only exact reuse (which is perhaps why it was eager to select those commits in the first place). Furthermore, the recent switch to the reverse-edge algorithm for generating bitmaps dropped this optimization entirely (and yet still performs better). So let's do a few cleanups: - drop the whole "reusing bitmaps" phase of generating bitmaps. It's not helping anything, and is mostly unused code (or worse, code that is using CPU but not doing anything useful) - drop the use of the on-disk reuse flag to select commits to bitmap - stop setting the on-disk reuse flag in bitmaps we generate (since nothing respects it anymore) We will keep a few innards of the reuse code, which will help us implement a more capable version of the "reuse" optimization: - simplify rebuild_existing_bitmaps() into a function that only builds the mapping of bits between the old and new orders, but doesn't actually convert any bitmaps - make rebuild_bitmap() public; we'll call it lazily to convert bitmaps as we traverse (using the mapping created above) Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-08 14:48:17 -08:00
Derrick Stolee	089f751360	pack-bitmap-write: build fewer intermediate bitmaps The bitmap_writer_build() method calls bitmap_builder_init() to construct a list of commits reachable from the selected commits along with a "reverse graph". This reverse graph has edges pointing from a commit to other commits that can reach that commit. After computing a reachability bitmap for a commit, the values in that bitmap are then copied to the reachability bitmaps across the edges in the reverse graph. We can now relax the role of the reverse graph to greatly reduce the number of intermediate reachability bitmaps we compute during this reverse walk. The end result is that we walk objects the same number of times as before when constructing the reachability bitmaps, but we also spend much less time copying bits between bitmaps and have much lower memory pressure in the process. The core idea is to select a set of "important" commits based on interactions among the sets of commits reachable from each selected commit. The first technical concept is to create a new 'commit_mask' member in the bb_commit struct. Note that the selected commits are provided in an ordered array. The first thing to do is to mark the ith bit in the commit_mask for the ith selected commit. As we walk the commit-graph, we copy the bits in a commit's commit_mask to its parents. At the end of the walk, the ith bit in the commit_mask for a commit C stores a boolean representing "The ith selected commit can reach C." As we walk, we will discover non-selected commits that are important. We will get into this later, but those important commits must also receive bit positions, growing the width of the bitmasks as we walk. At the true end of the walk, the ith bit means "the ith _important_ commit can reach C." MAXIMAL COMMITS --------------- We use a new 'maximal' bit in the bb_commit struct to represent whether a commit is important or not. The term "maximal" comes from the partially-ordered set of commits in the commit-graph where C >= P if P is a parent of C, and then extending the relationship transitively. Instead of taking the maximal commits across the entire commit-graph, we instead focus on selecting each commit that is maximal among commits with the same bits on in their commit_mask. This definition is important, so let's consider an example. Suppose we have three selected commits A, B, and C. These are assigned bitmasks 100, 010, and 001 to start. Each of these can be marked as maximal immediately because they each will be the uniquely maximal commit that contains their own bit. Keep in mind that that these commits may have different bitmasks after the walk; for example, if B can reach C but A cannot, then the final bitmask for C is 011. Even in these cases, C would still be a maximal commit among all commits with the third bit on in their masks. Now define sets X, Y, and Z to be the sets of commits reachable from A, B, and C, respectively. The intersections of these sets correspond to different bitmasks: * 100: X - (Y union Z) * 010: Y - (X union Z) * 001: Z - (X union Y) * 110: (X intersect Y) - Z * 101: (X intersect Z) - Y * 011: (Y intersect Z) - X * 111: X intersect Y intersect Z This can be visualized with the following Hasse diagram: 100 010 001 \| \ / \ / \| \| \/ \/ \| \| /\ /\ \| \| / \ / \ \| 110 101 011 \___ \| ___/ \ \| / 111 Some of these bitmasks may not be represented, depending on the topology of the commit-graph. In fact, we are counting on it, since the number of possible bitmasks is exponential in the number of selected commits, but is also limited by the total number of commits. In practice, very few bitmasks are possible because most commits converge on a common "trunk" in the commit history. With this three-bit example, we wish to find commits that are maximal for each bitmask. How can we identify this as we are walking? As we walk, we visit a commit C. Since we are walking the commits in topo-order, we know that C is visited after all of its children are visited. Thus, when we get C from the revision walk we inspect the 'maximal' property of its bb_data and use that to determine if C is truly important. Its commit_mask is also nearly final. If C is not one of the originally-selected commits, then assign a bit position to C (by incrementing num_maximal) and set that bit on in commit_mask. See "MULTIPLE MAXIMAL COMMITS" below for more detail on this. Now that the commit C is known to be maximal or not, consider each parent P of C. Compute two new values: * c_not_p : true if and only if the commit_mask for C contains a bit that is not contained in the commit_mask for P. * p_not_c : true if and only if the commit_mask for P contains a bit that is not contained in the commit_mask for P. If c_not_p is false, then P already has all of the bits that C would provide to its commit_mask. In this case, move on to other parents as C has nothing to contribute to P's state that was not already provided by other children of P. We continue with the case that c_not_p is true. This means there are bits in C's commit_mask to copy to P's commit_mask, so use bitmap_or() to add those bits. If p_not_c is also true, then set the maximal bit for P to one. This means that if no other commit has P as a parent, then P is definitely maximal. This is because no child had the same bitmask. It is important to think about the maximal bit for P at this point as a temporary state: "P is maximal based on current information." In contrast, if p_not_c is false, then set the maximal bit for P to zero. Further, clear all reverse_edges for P since any edges that were previously assigned to P are no longer important. P will gain all reverse edges based on C. The final thing we need to do is to update the reverse edges for P. These reverse edges respresent "which closest maximal commits contributed bits to my commit_mask?" Since C contributed bits to P's commit_mask in this case, C must add to the reverse edges of P. If C is maximal, then C is a 'closest' maximal commit that contributed bits to P. Add C to P's reverse_edges list. Otherwise, C has a list of maximal commits that contributed bits to its bitmask (and this list is exactly one element). Add all of these items to P's reverse_edges list. Be careful to ignore duplicates here. After inspecting all parents P for a commit C, we can clear the commit_mask for C. This reduces the memory load to be limited to the "width" of the commit graph. Consider our ABC/XYZ example from earlier and let's inspect the state of the commits for an interesting bitmask, say 011. Suppose that D is the only maximal commit with this bitmask (in the first three bits). All other commits with bitmask 011 have D as the only entry in their reverse_edges list. D's reverse_edges list contains B and C. COMPUTING REACHABILITY BITMAPS ------------------------------ Now that we have our definition, let's zoom out and consider what happens with our new reverse graph when computing reachability bitmaps. We walk the reverse graph in reverse-topo-order, so we visit commits with largest commit_masks first. After we compute the reachability bitmap for a commit C, we push the bits in that bitmap to each commit D in the reverse edge list for C. Then, when we finally visit D we already have the bits for everything reachable from maximal commits that D can reach and we only need to walk the objects in the set-difference. In our ABC/XYZ example, when we finally walk for the commit A we only need to walk commits with bitmask equal to A's bitmask. If that bitmask is 100, then we are only walking commits in X - (Y union Z) because the bitmap already contains the bits for objects reachable from (X intersect Y) union (X intersect Z) (i.e. the bits from the reachability bitmaps for the maximal commits with bitmasks 110 and 101). The behavior is intended to walk each commit (and the trees that commit introduces) at most once while allocating and copying fewer reachability bitmaps. There is one caveat: what happens when there are multiple maximal commits with the same bitmask, with respect to the initial set of selected commits? MULTIPLE MAXIMAL COMMITS ------------------------ Earlier, we mentioned that when we discover a new maximal commit, we assign a new bit position to that commit and set that bit position to one for that commit. This is absolutely important for interesting commit-graphs such as git/git and torvalds/linux. The reason is due to the existence of "butterflies" in the commit-graph partial order. Here is an example of four commits forming a butterfly: I J \|\ /\| \| \/ \| \| /\ \| \|/ \\| M N \ / \|/ Q Here, I and J both have parents M and N. In general, these do not need to be exact parent relationships, but reachability relationships. The most important part is that M and N cannot reach each other, so they are independent in the partial order. If I had commit_mask 10 and J had commit_mask 01, then M and N would both be assigned commit_mask 11 and be maximal commits with the bitmask 11. Then, what happens when M and N can both reach a commit Q? If Q is also assigned the bitmask 11, then it is not maximal but is reachable from both M and N. While this is not necessarily a deal-breaker for our abstract definition of finding maximal commits according to a given bitmask, we have a few issues that can come up in our larger picture of constructing reachability bitmaps. In particular, if we do not also consider Q to be a "maximal" commit, then we will walk commits reachable from Q twice: once when computing the reachability bitmap for M and another time when computing the reachability bitmap for N. This becomes much worse if the topology continues this pattern with multiple butterflies. The solution has already been mentioned: each of M and N are assigned their own bits to the bitmask and hence they become uniquely maximal for their bitmasks. Finally, Q also becomes maximal and thus we do not need to walk its commits multiple times. The final bitmasks for these commits are as follows: I:10 J:01 \|\ /\| \| \ _____/ \| \| /\____ \| \|/ \ \| M:111 N:1101 \ / Q:1111 Further, Q's reverse edge list is { M, N }, while M and N both have reverse edge list { I, J }. PERFORMANCE MEASUREMENTS ------------------------ Now that we've spent a LOT of time on the theory of this algorithm, let's show that this is actually worth all that effort. To test the performance, use GIT_TRACE2_PERF=1 when running 'git repack -abd' in a repository with no existing reachability bitmaps. This avoids any issues with keeping existing bitmaps to skew the numbers. Inspect the "building_bitmaps_total" region in the trace2 output to focus on the portion of work that is affected by this change. Here are the performance comparisons for a few repositories. The timings are for the following versions of Git: "multi" is the timing from before any reverse graph is constructed, where we might perform multiple traversals. "reverse" is for the previous change where the reverse graph has every reachable commit. Finally "maximal" is the version introduced here where the reverse graph only contains the maximal commits. Repository: git/git multi: 2.628 sec reverse: 2.344 sec maximal: 2.047 sec Repository: torvalds/linux multi: 64.7 sec reverse: 205.3 sec maximal: 44.7 sec So in all cases we've not only recovered any time lost to switching to the reverse-edge algorithm, but we come out ahead of "multi" in all cases. Likewise, peak heap has gone back to something reasonable: Repository: torvalds/linux multi: 2.087 GB reverse: 3.141 GB maximal: 2.288 GB While I do not have access to full fork networks on GitHub, Peff has run this algorithm on the chromium/chromium fork network and reported a change from 3 hours to ~233 seconds. That network is particularly beneficial for this approach because it has a long, linear history along with many tags. The "multi" approach was obviously quadratic and the new approach is linear. Helped-by: Jeff King <peff@peff.net> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-08 14:48:17 -08:00
Taylor Blau	c6b0c3910c	pack-bitmap.c: check reads more aggressively when loading Before 'load_bitmap_entries_v1()' reads an actual EWAH bitmap, it should check that it can safely do so by ensuring that there are at least 6 bytes available to be read (four for the commit's index position, and then two more for the xor offset and flags, respectively). Likewise, it should check that the commit index it read refers to a legitimate object in the pack. The first fix catches a truncation bug that was exposed when testing, and the second is purely precautionary. There are some possible future improvements, not pursued here. They are: - Computing the correct boundary of the bitmap itself in the caller and ensuring that we don't read past it. This may or may not be worth it, since in a truncation situation, all bets are off: (is the trailer still there and the bitmap entries malformed, or is the trailer truncated?). The best we can do is try to read what's there as if it's correct data (and protect ourselves when it's obviously bogus). - Avoid the magic "6" by teaching read_be32() and read_u8() (both of which are custom helpers for this function) to check sizes before advancing the pointers. - Adding more tests in this area. Testing these truncation situations are remarkably fragile to even subtle changes in the bitmap generation. So, the resulting tests are likely to be quite brittle. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-08 14:48:17 -08:00
Derrick Stolee	928e3f42ad	pack-bitmap-write: rename children to reverse_edges The bitmap_builder_init() method walks the reachable commits in topological order and constructs a "reverse graph" along the way. At the moment, this reverse graph contains an edge from commit A to commit B if and only if A is a parent of B. Thus, the name "children" is appropriate for for this reverse graph. In the next change, we will repurpose the reverse graph to not be directly-adjacent commits in the commit-graph, but instead a more abstract relationship. The previous changes have already incorporated the necessary updates to fill_bitmap_commit() that allow these edges to not be immediate children. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-08 14:48:17 -08:00
Derrick Stolee	1467b9572a	t5310: add branch-based checks The current rev-list tests that check the bitmap data only work on HEAD instead of multiple branches. Expand the test cases to handle both 'master' and 'other' branches. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-08 14:48:17 -08:00
Derrick Stolee	597b2c39af	commit: implement commit_list_contains() It can be helpful to check if a commit_list contains a commit. Use pointer equality, assuming lookup_commit() was used. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-08 14:48:16 -08:00
Derrick Stolee	ed03a58b65	bitmap: implement bitmap_is_subset() The bitmap_is_subset() function checks if the 'self' bitmap contains any bitmaps that are not on in the 'other' bitmap. Up until this patch, it had a declaration, but no implementation or callers. A subsequent patch will want this function, so implement it here. Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-08 14:48:16 -08:00
Derrick Stolee	6dc5ef759f	pack-bitmap-write: fill bitmap with commit history The current implementation of bitmap_writer_build() creates a reachability bitmap for every walked commit. After computing a bitmap for a commit, those bits are pushed to an in-progress bitmap for its children. fill_bitmap_commit() assumes the bits corresponding to objects reachable from the parents of a commit are already set. This means that when visiting a new commit, we only have to walk the objects reachable between it and any of its parents. A future change to bitmap_writer_build() will relax this condition so not all parents have their bits set. Prepare for that by having 'fill_bitmap_commit()' walk parents until reaching commits whose bits are already set. Then, walk the trees for these commits as well. This has no functional change with the current implementation of bitmap_writer_build(). Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-08 14:48:16 -08:00
Jeff King	010e5eacfb	pack-bitmap-write: pass ownership of intermediate bitmaps Our algorithm to generate reachability bitmaps walks through the commit graph from the bottom up, passing bitmap data from each commit to its descendants. For a linear stretch of history like: A -- B -- C our sequence of steps is: - compute the bitmap for A by walking its trees, etc - duplicate A's bitmap as a starting point for B; we can now free A's bitmap, since we only needed it as an intermediate result - OR in any extra objects that B can reach into its bitmap - duplicate B's bitmap as a starting point for C; likewise, free B's bitmap - OR in objects for C, and so on... Rather than duplicating bitmaps and immediately freeing the original, we can just pass ownership from commit to commit. Note that this doesn't always work: - the recipient may be a merge which already has an intermediate bitmap from its other ancestor. In that case we have to OR our result into it. Note that the first ancestor to reach the merge does get to pass ownership, though. - we may have multiple children; we can only pass ownership to one of them However, it happens often enough and copying bitmaps is expensive enough that this provides a noticeable speedup. On a clone of linux.git, this reduces the time to generate bitmaps from 205s to 70s. This is about the same amount of time it took to generate bitmaps using our old "many traversals" algorithm (the previous commit measures the identical scenario as taking 63s). It unfortunately provides only a very modest reduction in the peak memory usage, though. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-08 14:48:16 -08:00
Jeff King	4a9c581729	pack-bitmap-write: reimplement bitmap writing The bitmap generation code works by iterating over the set of commits for which we plan to write bitmaps, and then for each one performing a traditional traversal over the reachable commits and trees, filling in the bitmap. Between two traversals, we can often reuse the previous bitmap result as long as the first commit is an ancestor of the second. However, our worst case is that we may end up doing "n" complete complete traversals to the root in order to create "n" bitmaps. In a real-world case (the shared-storage repo consisting of all GitHub forks of chromium/chromium), we perform very poorly: generating bitmaps takes ~3 hours, whereas we can walk the whole object graph in ~3 minutes. This commit completely rewrites the algorithm, with the goal of accessing each object only once. It works roughly like this: - generate a list of commits in topo-order using a single traversal - invert the edges of the graph (so have parents point at their children) - make one pass in reverse topo-order, generating a bitmap for each commit and passing the result along to child nodes We generate correct results because each node we visit has already had all of its ancestors added to the bitmap. And we make only two linear passes over the commits. We also visit each tree usually only once. When filling in a bitmap, we don't bother to recurse into trees whose bit is already set in the bitmap (since we know we've already done so when setting their bit). That means that if commit A references tree T, none of its descendants will need to open T again. I say "usually", though, because it is possible for a given tree to be mentioned in unrelated parts of history (e.g., cherry-picking to a parallel branch). So we've accomplished our goal, and the resulting algorithm is pretty simple to understand. But there are some downsides, at least with this initial implementation: - we no longer reuse the results of any on-disk bitmaps when generating. So we'd expect to sometimes be slower than the original when bitmaps already exist. However, this is something we'll be able to add back in later. - we use much more memory. Instead of keeping one bitmap in memory at a time, we're passing them up through the graph. So our memory use should scale with the graph width (times the size of a bitmap). So how does it perform? For a clone of linux.git, generating bitmaps from scratch with the old algorithm took 63s. Using this algorithm it takes 205s. Which is much worse, but _might_ be acceptable if it behaved linearly as the size grew. It also increases peak heap usage by ~1G. That's not impossibly large, but not encouraging. On the complete fork-network of torvalds/linux, it increases the peak RAM usage by 40GB. Yikes. (I forgot to record the time it took, but the memory usage was too much to consider this reasonable anyway). On the complete fork-network of chromium/chromium, I ran out of memory before succeeding. Some back-of-the-envelope calculations indicate it would need 80+GB to complete. So at this stage, we've managed to make things much worse. But because of the way this new algorithm is structured, there are a lot of opportunities for optimization on top. We'll start implementing those in the follow-on patches. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-08 14:48:16 -08:00
Jeff King	ccae08e822	ewah: add bitmap_dup() function There's no easy way to make a copy of a bitmap. Obviously a caller can iterate over the bits and set them one by one in a new bitmap, but we can go much faster by copying whole words with memcpy(). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-08 14:48:16 -08:00
Jeff King	3ed675101a	ewah: implement bitmap_or() We have a function to bitwise-OR an ewah into an uncompressed bitmap, but not to OR two uncompressed bitmaps. Let's add it. Interestingly, we have a public header declaration going back to `e1273106f6` (ewah: compressed bitmap implementation, 2013-11-14), but the function was never implemented. That was all OK since there were no users of 'bitmap_or()', but a first caller will be added in a couple of patches. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-08 14:48:16 -08:00
Jeff King	2e2d141afd	ewah: make bitmap growth less aggressive If you ask to set a bit in the Nth word and we haven't yet allocated that many slots in our array, we'll increase the bitmap size to 2N. This means we might frequently end up with bitmaps that are twice the necessary size (as soon as you ask for the biggest bit, we'll size up to twice that). But if we just allocate as many words as were asked for, we may not grow fast enough. The worst case there is setting bit 0, then 1, etc. Each time we grow we'd just extend by one more word, giving us linear reallocations (and quadratic memory copies). A middle ground is relying on alloc_nr(), which causes us to grow by a factor of roughly 3/2 instead of 2. That's less aggressive than doubling, and it may help avoid fragmenting memory. (If we start with N, then grow twice, our total is N(3/2)^2 = 9N/4. After growing twice, that array of size 9N/4 can fit into the space vacated by the original array and first growth, N+3N/2 = 10N/4 > 9N/4, leading to less fragmentation in memory). Our worst case is still 3/2N wasted bits (you set bit N-1, then setting bit N causes us to grow by 3/2), but our average should be much better. This isn't usually that big a deal, but it will matter as we shift the reachability bitmap generation code to store more bitmaps in memory. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-08 14:48:16 -08:00
Jeff King	d574bf43e8	ewah: factor out bitmap growth We auto-grow bitmaps when somebody asks to set a bit whose position is outside of our currently allocated range. Other operations besides single bit-setting might need to do this, too, so let's pull it into its own function. Note that we change the semantics a little: you now ask for the number of words you'd like to have, not the id of the block you'd like to write to. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-08 14:48:16 -08:00
Jeff King	2978b00691	rev-list: die when --test-bitmap detects a mismatch You can use "git rev-list --test-bitmap HEAD" to check that bitmaps produce the same answer we'd get from a regular traversal. But if we detect an error, we only print "mismatch", and still exit with a successful error code. That makes the uses of --test-bitmap in the test suite (e.g., in t5310) mostly pointless: even if we saw an error, the tests wouldn't notice. Let's instead call die(), which will let these tests work as designed, and alert us if the bitmaps are bogus. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-08 14:48:15 -08:00
Jeff King	c5cd749076	t5310: drop size of truncated ewah bitmap We truncate the .bitmap file to 512 bytes and expect to run into problems reading an individual ewah file. But this length is somewhat arbitrary, and just happened to work when the test was added in `9d2e330b17` (ewah_read_mmap: bounds-check mmap reads, 2018-06-14). An upcoming commit will change the size of the history we create in the test repo, which will cause this test to fail. We can future-proof it a bit more by reducing the size of the truncated bitmap file. Signed-off-by: Jeff King <peff@peff.net> Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-08 14:48:15 -08:00
Jeff King	ec6c7b4367	pack-bitmap: bounds-check size of cache extension A .bitmap file may have a "name hash cache" extension, which puts a sequence of uint32_t values (one per object) at the end of the file. When we see a flag indicating this extension, we blindly subtract the appropriate number of bytes from our available length. However, if the .bitmap file is too short, we'll underflow our length variable and wrap around, thinking we have a very large length. This can lead to reading out-of-bounds bytes while loading individual ewah bitmaps. We can fix this by checking the number of available bytes when we parse the header. The existing "truncated bitmap" test is now split into two tests: one where we don't have this extension at all (and hence actually do try to read a truncated ewah bitmap) and one where we realize up-front that we can't even fit in the cache structure. We'll check stderr in each case to make sure we hit the error we're expecting. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-08 14:48:15 -08:00
Jeff King	ca51090200	pack-bitmap: fix header size check When we parse a .bitmap header, we first check that we have enough bytes to make a valid header. We do that based on sizeof(struct bitmap_disk_header). However, as of `0f4d6cada8` (pack-bitmap: make bitmap header handling hash agnostic, 2019-02-19), that struct oversizes its checksum member to GIT_MAX_RAWSZ. That means we need to adjust for the difference between that constant and the size of the actual hash we're using. That commit adjusted the code which moves our pointer forward, but forgot to update the size check. This meant we were overly strict about the header size (requiring room for a 32-byte worst-case hash, when sha1 is only 20 bytes). But in practice it didn't matter because bitmap files tend to have at least 12 bytes of actual data anyway, so it was unlikely for a valid file to be caught by this. Let's fix it by pulling the header size into a separate variable and using it in both spots. That fixes the bug and simplifies the code to make it harder to have a mismatch like this in the future. It will also come in handy in the next patch for more bounds checking. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-08 14:48:15 -08:00
Taylor Blau	3b1ca60f8f	ewah/ewah_bitmap.c: avoid open-coding ALLOC_GROW() 'ewah/ewah_bitmap.c:buffer_grow()' is responsible for growing the buffer used to store the bits of an EWAH bitmap. It is essentially doing the same task as the 'ALLOC_GROW()' macro, so use that instead. This simplifies the callers of 'buffer_grow()', who no longer have to ask for a specific size, but rather specify how much of the buffer they need. They also no longer need to guard 'buffer_grow()' behind an if statement, since 'ALLOC_GROW()' (and, by extension, 'buffer_grow()') is a noop if the buffer is already large enough. But, the most significant change is that this fixes a bug when calling buffer_grow() with both 'alloc_size' and 'new_size' set to 1. In this case, truncating integer math will leave the new size set to 1, causing the buffer to never grow. Instead, let alloc_nr() handle this, which asks for '(new_size + 16) * 3 / 2' instead of 'new_size * 3 / 2'. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-08 14:48:15 -08:00
Sangeeta Jain	8ef9312464	diff: do not show submodule with untracked files as "-dirty" Git diff reports a submodule directory as -dirty even when there are only untracked files in the submodule directory. This is inconsistent with what `git describe --dirty` says when run in the submodule directory in that state. Make `--ignore-submodules=untracked` the default for `git diff` when there is no configuration variable or command line option, so that the command would not give '-dirty' suffix to a submodule whose working tree has untracked files, to make it consistent with `git describe --dirty` that is run in the submodule working tree. And also make `--ignore-submodules=none` the default for `git status` so that the user doesn't end up deleting a submodule that has uncommitted (untracked) files. Signed-off-by: Sangeeta Jain <sangunb09@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-08 14:27:35 -08:00
Ævar Arnfjörð Bjarmason	7c1f79fc16	pretty format %(trailers) test: split a long line Split a very long line in a test introduced in `0b691d8685` (pretty: add support for separator option in %(trailers), 2019-01-28). This makes it easier to read, especially as follow-up commits will copy this test as a template. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-07 10:23:11 -08:00
Derrick Stolee	16c5690929	maintenance: include 'cron' details in docs Advanced and expert users may want to know how 'git maintenance start' schedules background maintenance in order to customize their own schedules beyond what the maintenance.* config values allow. Start a new set of sections in git-maintenance.txt that describe how 'cron' is used to run these tasks. This is particularly valuable for users who want to inspect what Git is doing or for users who want to customize the schedule further. Having a baseline can provide a way forward for users who have never worked with cron schedules. Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-24 13:02:29 -08:00
Derrick Stolee	31345d5545	maintenance: extract platform-specific scheduling The existing schedule mechanism using 'cron' is supported by POSIX platforms, but not Windows. It also works slightly differently on macOS to significant detriment of the user experience. To allow for new implementations on these platforms, extract a method that performs the platform-specific scheduling mechanism. This will be swapped at compile time with new implementations on specialized platforms. As we add this generality, rename GIT_TEST_CRONTAB to GIT_TEST_MAINT_SCHEDULER. Further, this variable is now parsed as "<scheduler>:<command>" so we can test platform-specific scheduling logic even when not on the correct platform. By specifying the <scheduler> in this string, we will be able to test all three sets of Git logic from a Linux machine. Co-authored-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-24 13:02:29 -08:00
Johannes Schindelin	8b70966aa9	tests: drop prereq `PREPARE_FOR_MAIN_BRANCH` where no longer needed We introduced the `PREPARE_FOR_MAIN_BRANCH` prereq for the sole purpose of allowing us to perform the non-trivial adjustments regarding the `master` -> `main` rename before the automatable ones. Now that the transition is almost complete, we can stop using it in most instances. The only two exceptions are t5526 and t9902: at the time of writing, there are other patches in flight that touch these test scripts, therefore their transition to `main` is postponed to a later date. This patch is the result of this command: sed -i 's/PREPARE_FOR_MAIN_BRANCH[ ,]//' t/t[0-9].sh && git checkout HEAD -- t/t5526\ t/t9902\* Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-19 15:44:18 -08:00
Johannes Schindelin	8dcf73c5c9	t99: adjust the references to the default branch name "main" Carefully excluding t9902, which sees independent development elsewhere at the time of writing, we use `main` as the default branch name in t9903. This trick was performed via $ (cd t && sed -i -e 's/master/main/g' -e 's/MASTER/MAIN/g' \ -e 's/Master/Main/g' -- t99.sh lib-cvs.sh && git checkout HEAD -- t9902\*) This allows us to define `GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main` for all tests (except the ones we specifically excluded for now). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-19 15:44:18 -08:00
Johannes Schindelin	46a29020bb	tests(git-p4): transition to the default branch name `main` In the previous commits, we adjusted the test suite to use the branch name `main` for initial branches. The `git p4`-related tests are a bit harder to adjust because `git p4` uses the ref `refs/heads/p4/master` to track the remote branches, and for now, we do not want to change that (this might be the subject of a future patch series). We only need to adjust for the actual initial branch name to be changed to `main`. This allows us to define `GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main` for those tests. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-19 15:44:18 -08:00
Johannes Schindelin	765577b5d0	t9[5-7]: adjust the references to the default branch name "main" This trick was performed via $ (cd t && sed -i -e 's/master/main/g' -e 's/MASTER/MAIN/g' \ -e 's/Master/Main/g' -- t9[5-7].sh lib-cvs.sh) This allows us to define `GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main` for those tests. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-19 15:44:18 -08:00
Johannes Schindelin	a881baa2c3	t9[0-4]: adjust the references to the default branch name "main" This trick was performed via $ (cd t && sed -i -e 's/master/main/g' -e 's/MASTER/MAIN/g' \ -e 's/Master/Main/g' -- t9[0-4].sh) This allows us to define `GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main` for those tests. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-19 15:44:18 -08:00
Johannes Schindelin	747f6c6805	t8: adjust the references to the default branch name "main" This trick was performed via $ (cd t && sed -i -e 's/master/main/g' -e 's/MASTER/MAIN/g' \ -e 's/Master/Main/g' -- t8.sh annotate*.sh) This allows us to define `GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main` for those tests. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-19 15:44:18 -08:00
Johannes Schindelin	1e2ae142c0	t7[5-9]: adjust the references to the default branch name "main" Excluding t7817, which is added in an unrelated patch series at the time of writing, this adjusts t7[5-9]. This trick was performed via $ (cd t && sed -i -e 's/master/main/g' -e 's/MASTER/MAIN/g' \ -e 's/Master/Main/g' -- t7[5-9]*.sh) This allows us to define `GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main` for those tests. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-19 15:44:18 -08:00
Johannes Schindelin	01dc81336d	t7[0-4]: adjust the references to the default branch name "main" Carefully excluding t7064, which sees independent development elsewhere at the time of writing, we use `main` as the default branch name in t7[0-4]. This trick was performed via $ (cd t && sed -i -e 's/master/main/g' -e 's/MASTER/MAIN/g' \ -e 's/Master/Main/g' -- t7[0-4].sh && git checkout HEAD -- t7064\) This allows us to define `GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main` for those tests. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-19 15:44:18 -08:00
Johannes Schindelin	5902f5f460	t6[4-9]: adjust the references to the default branch name "main" This trick was performed via $ (cd t && sed -i -e 's/master/main/g' -e 's/MASTER/MAIN/g' \ -e 's/Master/Main/g' -- t6[4-9].sh) This allows us to define `GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main` for those tests. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-19 15:44:18 -08:00
Johannes Schindelin	1f53df54eb	t64*: preemptively adjust alignment to prepare for `master` -> `main` We are in the process of renaming the default branch name to `main`, which is two characters shorter than `master`. Therefore, some lines need to be adjusted in t6416, t6422 and t6427 that want to align text involving the default branch name. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-19 15:44:18 -08:00
Johannes Schindelin	1550bb6ed0	t6[0-3]: adjust the references to the default branch name "main" Carefully excluding t6300, which sees independent development elsewhere at the time of writing, we use `main` as the default branch name in t6[0-3]. This trick was performed via $ (cd t && sed -i -e 's/master/main/g' -e 's/MASTER/MAIN/g' \ -e 's/Master/Main/g' -- t6[0-3].sh && git checkout HEAD -- t6300\) This allows us to define `GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main` for those tests. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-19 15:44:18 -08:00
Johannes Schindelin	95cf2c0187	t5[6-9]: adjust the references to the default branch name "main" This trick was performed via $ (cd t && sed -i -e 's/master/main/g' -e 's/MASTER/MAIN/g' \ -e 's/Master/Main/g' -- t5[6-9].sh) This allows us to define `GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main` for those tests. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-19 15:44:18 -08:00
Johannes Schindelin	028cb644ec	t55[4-9]: adjust the references to the default branch name "main" This trick was performed via $ (cd t && sed -i -e 's/master/main/g' -e 's/MASTER/MAIN/g' \ -e 's/Master/Main/g' -e 's/retsam/niam/g' \ -- t55[4-9].sh t556x*) This allows us to define `GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main` for those tests. Note that t5541 uses the reversed `master` name: `retsam`. We replace it by the equivalent for `main`: `niam`. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-19 15:44:18 -08:00
Johannes Schindelin	3ac8f6301e	t55[23]: adjust the references to the default branch name "main" Carefully excluding t5526, which sees independent development elsewhere at the time of writing, we use `main` as the default branch name in t55[23]. This trick was performed via $ (cd t && sed -i -e 's/master/main/g' -e 's/MASTER/MAIN/g' \ -e 's/Master/Main/g' -e 's/naster/nain/g' -- \ t55[23].sh && git checkout HEAD -- t5526\) Note that t5533 contains a variation of the name `master` (`naster`) that we rename here, too. This commit allows us to define `GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main` for that range of tests. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-19 15:44:18 -08:00
Johannes Schindelin	bc925ce3f3	t551: adjust the references to the default branch name "main" This trick was performed via $ (cd t && sed -i -e 's/master/main/g' -e 's/MASTER/MAIN/g' \ -e 's/Master/Main/g' -- t551.sh) This allows us to define `GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main` for those tests. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-19 15:44:18 -08:00
Johannes Schindelin	3275f4e886	t550: adjust the references to the default branch name "main" This trick was performed via $ (cd t && sed -i -e 's/master/main/g' -e 's/MASTER/MAIN/g' \ -e 's/Master/Main/g' -- t550.sh) This allows us to define `GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main` for those tests. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-19 15:44:18 -08:00
Johannes Schindelin	e4010de9f0	t5503: prepare aligned comment for replacing `master` with `main` In an upcoming commit, we will use `main` as the default branch name in t5503 instead of `master`. This will require extra padding in ASCII-art commit graphs, which we hereby add preemptively. By doing this preemptively rather than after the commit applying the search-and-replace, it is more obvious that we caught all aligned comments that are affected by the latter commit. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-19 15:44:18 -08:00
Johannes Schindelin	966b4be276	t5[0-4]: adjust the references to the default branch name "main" Carefully excluding t5310, which is developed independently of the current patch series at the time of writing, we now use `main` as default branch in t5[0-4]. This trick was performed via $ (cd t && sed -i -e 's/master/main/g' -e 's/MASTER/MAIN/g' \ -e 's/Master/Main/g' -- t5[0-4].sh && git checkout HEAD -- t5310\) This allows us to define `GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main` for those tests. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-19 15:44:18 -08:00
Johannes Schindelin	4b071211e6	t5323: prepare centered comment for `master` -> `main` We are about to search-and-replace all mentions of `master` in t5323 by `main`, which is two characters shorter. To prepare for that, let's add padding to centered lines that will make them briefly uncentered, but will be re-centered in the commit that performs that rename. Doing it this way (instead of padding after replacing) makes it easier to verify the validity of the patch that replaces `master` by `main`. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-19 15:44:18 -08:00
Johannes Schindelin	8f37854b18	t4: adjust the references to the default branch name "main" Carefully excluding t4013 and t4015, which see independent development elsewhere at the time of writing, we use `main` as the default branch name in t4. This trick was performed via $ (cd t && sed -i -e 's/master/main/g' -e 's/MASTER/MAIN/g' \ -e 's/Master/Main/g' -- t4.sh t4211/.export && git checkout HEAD -- t4013\*) This allows us to define `GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main` for those tests. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-19 15:44:18 -08:00
Johannes Schindelin	cbc75a12f0	t3[5-9]: adjust the references to the default branch name "main" This trick was performed via $ (cd t && sed -i -e 's/master/main/g' -e 's/MASTER/MAIN/g' \ -e 's/Master/Main/g' -- t3[5-9].sh) This allows us to define `GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main` for those tests. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-19 15:44:18 -08:00
Johannes Schindelin	d1c02d93b3	t34: adjust the references to the default branch name "main" Carefully excluding t3404, which sees independent development elsewhere at the time of writing, we use `main` as the default branch name in t34. This trick was performed via $ (cd t && sed -i -e 's/master/main/g' -e 's/MASTER/MAIN/g' \ -e 's/Master/Main/g' -- t34.sh && git checkout HEAD -- t34\) This allows us to define `GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main` for those tests. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-19 15:44:18 -08:00
Johannes Schindelin	ba766eebee	t3416: preemptively adjust alignment in a comment We are about to adjust t3416 for the new default branch name `main`. This name is two characters shorter and therefore needs two spaces more padding to align correctly. Adjusting the alignment before the big search-and-replace makes it easier to verify that the final result does not leave any misaligned lines behind. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-19 15:44:18 -08:00
Johannes Schindelin	d6c6b10817	t3[0-3]: adjust the references to the default branch name "main" Carefully excluding t3040, which sees independent development elsewhere at the time of writing, we transition above-mentioned tests to the default branch name `main`. This trick was performed via $ (cd t && sed -i -e 's/master/main/g' -e 's/MASTER/MAIN/g' \ -e 's/Master/Main/g' -- t3[0-3].sh t3206/* && git checkout HEAD -- t3040\*) This allows us to define `GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main` for those tests. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-19 15:44:18 -08:00
Johannes Schindelin	883b98efad	t2: adjust the references to the default branch name "main" Carefully excluding t2106, which sees independent development elsewhere at the time of writing, we transition above-mentioned tests to the default branch name `main`. This trick was performed via $ (cd t && sed -i -e 's/master/main/g' -e 's/MASTER/MAIN/g' \ -e 's/Master/Main/g' -- t2.sh && git checkout HEAD -- t2106\*) This allows us to define `GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main` for those tests. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-19 15:44:18 -08:00
Johannes Schindelin	06d531486e	t[01]: adjust the references to the default branch name "main" Carefully excluding t1309, which sees independent development elsewhere at the time of writing, we transition above-mentioned tests to the default branch name `main`. This trick was performed via $ (cd t && sed -i -e 's/master/main/g' -e 's/MASTER/MAIN/g' \ -e 's/Master/Main/g' -e 's/naster/nain/g' -- t[01].sh && git checkout HEAD -- t1309\*) Note that t5533 contains a variation of the name `master` (`naster`) that we rename here, too. This allows us to define `GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main` for those tests. Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-19 15:44:18 -08:00
Johannes Schindelin	c2fdc8820c	t0060: preemptively adjust alignment We are about to adjust t0060 for the new default branch name `main`. This name is two characters shorter and therefore needs two spaces more padding to align correctly. Adjusting the alignment before the big search-and-replace makes it easier to verify that the final result does not leave any misaligned lines behind. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-19 15:44:17 -08:00
Johannes Schindelin	334afbc76f	tests: mark tests relying on the current default for `init.defaultBranch` In addition to the manual adjustment to let the `linux-gcc` CI job run the test suite with `master` and then with `main`, this patch makes sure that GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME is set in all test scripts that currently rely on the initial branch name being `master by default. To determine which test scripts to mark up, the first step was to force-set the default branch name to `master` in - all test scripts that contain the keyword `master`, - t4211, which expects `t/t4211/history.export` with a hard-coded ref to initialize the default branch, - t5560 because it sources `t/t556x_common` which uses `master`, - t8002 and t8012 because both source `t/annotate-tests.sh` which also uses `master`) This trick was performed by this command: $ sed -i '/^ \. \.\/$test-lib\\|lib-\(bash\\|cvs\\|git-svn$\\|gitweb-lib\)\.sh$/i\ GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=master\ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME\ ' $(git grep -l master t/t[0-9].sh) \ t/t4211.sh t/t5560.sh t/t8002.sh t/t8012.sh After that, careful, manual inspection revealed that some of the test scripts containing the needle `master` do not actually rely on a specific default branch name: either they mention `master` only in a comment, or they initialize that branch specificially, or they do not actually refer to the current default branch. Therefore, the aforementioned modification was undone in those test scripts thusly: $ git checkout HEAD -- \ t/t0027-auto-crlf.sh t/t0060-path-utils.sh \ t/t1011-read-tree-sparse-checkout.sh \ t/t1305-config-include.sh t/t1309-early-config.sh \ t/t1402-check-ref-format.sh t/t1450-fsck.sh \ t/t2024-checkout-dwim.sh \ t/t2106-update-index-assume-unchanged.sh \ t/t3040-subprojects-basic.sh t/t3301-notes.sh \ t/t3308-notes-merge.sh t/t3423-rebase-reword.sh \ t/t3436-rebase-more-options.sh \ t/t4015-diff-whitespace.sh t/t4257-am-interactive.sh \ t/t5323-pack-redundant.sh t/t5401-update-hooks.sh \ t/t5511-refspec.sh t/t5526-fetch-submodules.sh \ t/t5529-push-errors.sh t/t5530-upload-pack-error.sh \ t/t5548-push-porcelain.sh \ t/t5552-skipping-fetch-negotiator.sh \ t/t5572-pull-submodule.sh t/t5608-clone-2gb.sh \ t/t5614-clone-submodules-shallow.sh \ t/t7508-status.sh t/t7606-merge-custom.sh \ t/t9302-fast-import-unpack-limit.sh We excluded one set of test scripts in these commands, though: the range of `git p4` tests. The reason? `git p4` stores the (foreign) remote branch in the branch called `p4/master`, which is obviously not the default branch. Manual analysis revealed that only five of these tests actually require a specific default branch name to pass; They were modified thusly: $ sed -i '/^ \. \.\/lib-git-p4\.sh$/i\ GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=master\ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME\ ' t/t980[0167].sh t/t9811*.sh Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-19 15:44:17 -08:00
Junio C Hamano	fced6d171e	Merge 'jk/diff-release-filespec-fix' into js/default-branch-name-tests-final-stretch * jk/diff-release-filespec-fix: t7800: simplify difftool test diff: allow passing NULL to diff_free_filespec_data()	2020-11-19 15:27:59 -08:00
Junio C Hamano	6d37ca2165	Merge branch 'en/strmap' into en/merge-ort-impl * en/strmap: shortlog: use strset from strmap.h Use new HASHMAP_INIT macro to simplify hashmap initialization strmap: take advantage of FLEXPTR_ALLOC_STR when relevant strmap: enable allocations to come from a mem_pool strmap: add a strset sub-type strmap: split create_entry() out of strmap_put() strmap: add functions facilitating use as a string->int map strmap: enable faster clearing and reusing of strmaps strmap: add more utility functions strmap: new utility functions hashmap: provide deallocation function names hashmap: introduce a new hashmap_partial_clear() hashmap: allow re-use after hashmap_free() hashmap: adjust spacing to fix argument alignment hashmap: add usage documentation explaining hashmap_free[_entries]()	2020-11-11 12:56:29 -08:00

1084 changed files with 119839 additions and 59702 deletions

1

.gitattributes vendored

View File

 @ -6,6 +6,7 @@
 *.pm eol=lf diff=perl
 *.py eol=lf diff=python
 *.bat eol=crlf
 CODE_OF_CONDUCT.md -whitespace
 /Documentation/**/*.txt eol=lf
 /command-list.txt eol=lf
 /GIT-VERSION-GEN eol=lf

									
										7

.github/workflows/main.yml
									
										vendored
									
												View File
												
				@ -186,6 +186,11 @@ jobs:

				        ## Unzip and remove the artifact

				        unzip artifacts.zip

				        rm artifacts.zip

				    - name: initialize vcpkg

				      uses: actions/checkout@v2

				      with:

				        repository: 'microsoft/vcpkg'

				        path: 'compat/vcbuild/vcpkg'

				    - name: download vcpkg artifacts

				      shell: powershell

				      run: |

				@ -289,7 +294,7 @@ jobs:

				          - jobname: osx-gcc

				            cc: gcc

				            pool: macos-latest

				          - jobname: GETTEXT_POISON

				          - jobname: linux-gcc-default

				            cc: gcc

				            pool: ubuntu-latest

				    env:

2

.gitignore vendored

View File

 @ -33,6 +33,7 @@
 /git-check-mailmap
 /git-check-ref-format
 /git-checkout
 /git-checkout--worker
 /git-checkout-index
 /git-cherry
 /git-cherry-pick
 @ -162,6 +163,7 @@
 /git-stripspace
 /git-submodule
 /git-submodule--helper
 /git-subtree
 /git-svn
 /git-switch
 /git-symbolic-ref

1

.mailmap

View File

 @ -220,6 +220,7 @@ Philipp A. Hartmann <pah@qo.cx> <ph@sorgh.de>
 Philippe Bruhat <book@cpan.org>
 Ralf Thielow <ralf.thielow@gmail.com> <ralf.thielow@googlemail.com>
 Ramsay Jones <ramsay@ramsayjones.plus.com> <ramsay@ramsay1.demon.co.uk>
 Ramkumar Ramachandra <r@artagnon.com> <artagnon@gmail.com>
 Randall S. Becker <randall.becker@nexbridge.ca> <rsbecker@nexbridge.com>
 René Scharfe <l.s.r@web.de> <rene.scharfe@lsrfire.ath.cx>
 René Scharfe <l.s.r@web.de> Rene Scharfe

									
										2

.travis.yml
									
												View File
												
				@ -16,7 +16,7 @@ compiler:

				matrix:

				  include:

				    - env: jobname=GETTEXT_POISON

				    - env: jobname=linux-gcc-default

				      os: linux

				      compiler:

				      addons:

									
										154

CODE_OF_CONDUCT.md
									
												View File
												
				@ -8,73 +8,64 @@ this code of conduct may be banned from the community.

				## Our Pledge

				In the interest of fostering an open and welcoming environment, we as

				contributors and maintainers pledge to make participation in our project and

				our community a harassment-free experience for everyone, regardless of age,

				body size, disability, ethnicity, sex characteristics, gender identity and

				expression, level of experience, education, socio-economic status,

				nationality, personal appearance, race, religion, or sexual identity and

				orientation.

				We as members, contributors, and leaders pledge to make participation in our

				community a harassment-free experience for everyone, regardless of age, body

				size, visible or invisible disability, ethnicity, sex characteristics, gender

				identity and expression, level of experience, education, socio-economic status,

				nationality, personal appearance, race, religion, or sexual identity

				and orientation.

				We pledge to act and interact in ways that contribute to an open, welcoming,

				diverse, inclusive, and healthy community.

				## Our Standards

				Examples of behavior that contributes to creating a positive environment

				include:

				Examples of behavior that contributes to a positive environment for our

				community include:

				* Using welcoming and inclusive language

				* Being respectful of differing viewpoints and experiences

				* Gracefully accepting constructive criticism

				* Focusing on what is best for the community

				* Showing empathy towards other community members

				* Demonstrating empathy and kindness toward other people

				* Being respectful of differing opinions, viewpoints, and experiences

				* Giving and gracefully accepting constructive feedback

				* Accepting responsibility and apologizing to those affected by our mistakes,

				  and learning from the experience

				* Focusing on what is best not just for us as individuals, but for the

				  overall community

				Examples of unacceptable behavior by participants include:

				Examples of unacceptable behavior include:

				* The use of sexualized language or imagery and unwelcome sexual attention or

				  advances

				* Trolling, insulting/derogatory comments, and personal or political attacks

				* The use of sexualized language or imagery, and sexual attention or

				  advances of any kind

				* Trolling, insulting or derogatory comments, and personal or political attacks

				* Public or private harassment

				* Publishing others' private information, such as a physical or electronic

				  address, without explicit permission

				* Publishing others' private information, such as a physical or email

				  address, without their explicit permission

				* Other conduct which could reasonably be considered inappropriate in a

				  professional setting

				## Our Responsibilities

				## Enforcement Responsibilities

				Project maintainers are responsible for clarifying the standards of acceptable

				behavior and are expected to take appropriate and fair corrective action in

				response to any instances of unacceptable behavior.

				Community leaders are responsible for clarifying and enforcing our standards of

				acceptable behavior and will take appropriate and fair corrective action in

				response to any behavior that they deem inappropriate, threatening, offensive,

				or harmful.

				Project maintainers have the right and responsibility to remove, edit, or

				reject comments, commits, code, wiki edits, issues, and other contributions

				that are not aligned to this Code of Conduct, or to ban temporarily or

				permanently any contributor for other behaviors that they deem inappropriate,

				threatening, offensive, or harmful.

				Community leaders have the right and responsibility to remove, edit, or reject

				comments, commits, code, wiki edits, issues, and other contributions that are

				not aligned to this Code of Conduct, and will communicate reasons for moderation

				decisions when appropriate.

				## Scope

				This Code of Conduct applies within all project spaces, and it also applies

				when an individual is representing the project or its community in public

				spaces. Examples of representing a project or community include using an

				official project e-mail address, posting via an official social media account,

				or acting as an appointed representative at an online or offline event.

				Representation of a project may be further defined and clarified by project

				maintainers.

				This Code of Conduct applies within all community spaces, and also applies when

				an individual is officially representing the community in public spaces.

				Examples of representing our community include using an official e-mail address,

				posting via an official social media account, or acting as an appointed

				representative at an online or offline event.

				## Enforcement

				Instances of abusive, harassing, or otherwise unacceptable behavior may be

				reported by contacting the project team at git@sfconservancy.org. All

				complaints will be reviewed and investigated and will result in a response

				that is deemed necessary and appropriate to the circumstances. The project

				team is obligated to maintain confidentiality with regard to the reporter of

				an incident. Further details of specific enforcement policies may be posted

				separately.

				Project maintainers who do not follow or enforce the Code of Conduct in good

				faith may face temporary or permanent repercussions as determined by other

				members of the project's leadership.

				The project leadership team can be contacted by email as a whole at

				reported to the community leaders responsible for enforcement at

				git@sfconservancy.org, or individually:

				  - Ævar Arnfjörð Bjarmason <avarab@gmail.com>

				@ -82,12 +73,73 @@ git@sfconservancy.org, or individually:

				  - Jeff King <peff@peff.net>

				  - Junio C Hamano <gitster@pobox.com>

				All complaints will be reviewed and investigated promptly and fairly.

				All community leaders are obligated to respect the privacy and security of the

				reporter of any incident.

				## Enforcement Guidelines

				Community leaders will follow these Community Impact Guidelines in determining

				the consequences for any action they deem in violation of this Code of Conduct:

				### 1. Correction

				**Community Impact**: Use of inappropriate language or other behavior deemed

				unprofessional or unwelcome in the community.

				**Consequence**: A private, written warning from community leaders, providing

				clarity around the nature of the violation and an explanation of why the

				behavior was inappropriate. A public apology may be requested.

				### 2. Warning

				**Community Impact**: A violation through a single incident or series

				of actions.

				**Consequence**: A warning with consequences for continued behavior. No

				interaction with the people involved, including unsolicited interaction with

				those enforcing the Code of Conduct, for a specified period of time. This

				includes avoiding interactions in community spaces as well as external channels

				like social media. Violating these terms may lead to a temporary or

				permanent ban.

				### 3. Temporary Ban

				**Community Impact**: A serious violation of community standards, including

				sustained inappropriate behavior.

				**Consequence**: A temporary ban from any sort of interaction or public

				communication with the community for a specified period of time. No public or

				private interaction with the people involved, including unsolicited interaction

				with those enforcing the Code of Conduct, is allowed during this period.

				Violating these terms may lead to a permanent ban.

				### 4. Permanent Ban

				**Community Impact**: Demonstrating a pattern of violation of community

				standards, including sustained inappropriate behavior,  harassment of an

				individual, or aggression toward or disparagement of classes of individuals.

				**Consequence**: A permanent ban from any sort of public interaction within

				the community.

				## Attribution

				This Code of Conduct is adapted from the [Contributor Covenant][homepage],

				version 1.4, available at https://www.contributor-covenant.org/version/1/4/code-of-conduct.html

				version 2.0, available at

				[https://www.contributor-covenant.org/version/2/0/code_of_conduct.html][v2.0].

				Community Impact Guidelines were inspired by 

				[Mozilla's code of conduct enforcement ladder][Mozilla CoC].

				For answers to common questions about this code of conduct, see the FAQ at

				[https://www.contributor-covenant.org/faq][FAQ]. Translations are available 

				at [https://www.contributor-covenant.org/translations][translations].

				[homepage]: https://www.contributor-covenant.org

				[v2.0]: https://www.contributor-covenant.org/version/2/0/code_of_conduct.html

				[Mozilla CoC]: https://github.com/mozilla/diversity

				[FAQ]: https://www.contributor-covenant.org/faq

				[translations]: https://www.contributor-covenant.org/translations

				For answers to common questions about this code of conduct, see

				https://www.contributor-covenant.org/faq

12

Documentation/CodingGuidelines

View File

 @ -175,6 +175,11 @@ For shell scripts specifically (not exhaustive):
    does not have such a problem.
  - Even though "local" is not part of POSIX, we make heavy use of it
    in our test suite.  We do not use it in scripted Porcelains, and
    hopefully nobody starts using "local" before they are reimplemented
    in C ;-)
 For C programs:
 @ -498,7 +503,12 @@ Error Messages
  - Do not end error messages with a full stop.
  - Do not capitalize ("unable to open %s", not "Unable to open %s")
  - Do not capitalize the first word, only because it is the first word
    in the message ("unable to open %s", not "Unable to open %s").  But
    "SHA-3 not supported" is fine, because the reason the first word is
    capitalized is not because it is at the beginning of the sentence,
    but because the word would be spelled in capital letters even when
    it appeared in the middle of the sentence.
  - Say what the error is first ("cannot open %s", not "%s: cannot open")

									
										26

Documentation/Makefile
									
												View File
												
				@ -2,6 +2,8 @@

				MAN1_TXT =

				MAN5_TXT =

				MAN7_TXT =

				HOWTO_TXT =

				DOC_DEP_TXT =

				TECH_DOCS =

				ARTICLES =

				SP_ARTICLES =

				@ -21,6 +23,7 @@ MAN1_TXT += gitweb.txt

				MAN5_TXT += gitattributes.txt

				MAN5_TXT += githooks.txt

				MAN5_TXT += gitignore.txt

				MAN5_TXT += gitmailmap.txt

				MAN5_TXT += gitmodules.txt

				MAN5_TXT += gitrepository-layout.txt

				MAN5_TXT += gitweb.conf.txt

				@ -41,6 +44,11 @@ MAN7_TXT += gittutorial-2.txt

				MAN7_TXT += gittutorial.txt

				MAN7_TXT += gitworkflows.txt

				HOWTO_TXT += $(wildcard howto/*.txt)

				DOC_DEP_TXT += $(wildcard *.txt)

				DOC_DEP_TXT += $(wildcard config/*.txt)

				ifdef MAN_FILTER

				MAN_TXT = $(filter $(MAN_FILTER),$(MAN1_TXT) $(MAN5_TXT) $(MAN7_TXT))

				else

				@ -75,6 +83,7 @@ SP_ARTICLES += howto/rebuild-from-update-hook

				SP_ARTICLES += howto/rebase-from-internal-branch

				SP_ARTICLES += howto/keep-canonical-history-correct

				SP_ARTICLES += howto/maintain-git

				SP_ARTICLES += howto/coordinate-embargoed-releases

				API_DOCS = $(patsubst %.txt,%,$(filter-out technical/api-index-skel.txt technical/api-index.txt, $(wildcard technical/api-*.txt)))

				SP_ARTICLES += $(API_DOCS)

				@ -89,6 +98,7 @@ TECH_DOCS += technical/multi-pack-index

				TECH_DOCS += technical/pack-format

				TECH_DOCS += technical/pack-heuristics

				TECH_DOCS += technical/pack-protocol

				TECH_DOCS += technical/parallel-checkout

				TECH_DOCS += technical/partial-clone

				TECH_DOCS += technical/protocol-capabilities

				TECH_DOCS += technical/protocol-common

				@ -283,7 +293,7 @@ docdep_prereqs = \

					mergetools-list.made $(mergetools_txt) \

					cmd-list.made $(cmds_txt)

				doc.dep : $(docdep_prereqs) $(wildcard *.txt) $(wildcard config/*.txt) build-docdep.perl

				doc.dep : $(docdep_prereqs) $(DOC_DEP_TXT) build-docdep.perl

					$(QUIET_GEN)$(RM) $@+ $@ && \

					$(PERL_PATH) ./build-docdep.perl >$@+ $(QUIET_STDERR) && \

					mv $@+ $@

				@ -426,9 +436,9 @@ $(patsubst %.txt,%.texi,$(MAN_TXT)): %.texi : %.xml

					$(DOCBOOK2X_TEXI) --to-stdout $*.xml >$@+ && \

					mv $@+ $@

				howto-index.txt: howto-index.sh $(wildcard howto/*.txt)

				howto-index.txt: howto-index.sh $(HOWTO_TXT)

					$(QUIET_GEN)$(RM) $@+ $@ && \

					'$(SHELL_PATH_SQ)' ./howto-index.sh $(sort $(wildcard howto/*.txt)) >$@+ && \

					'$(SHELL_PATH_SQ)' ./howto-index.sh $(sort $(HOWTO_TXT)) >$@+ && \

					mv $@+ $@

				$(patsubst %,%.html,$(ARTICLES)) : %.html : %.txt

				@ -437,7 +447,7 @@ $(patsubst %,%.html,$(ARTICLES)) : %.html : %.txt

				WEBDOC_DEST = /pub/software/scm/git/docs

				howto/%.html: ASCIIDOC_EXTRA += -a git-relative-html-prefix=../

				$(patsubst %.txt,%.html,$(wildcard howto/*.txt)): %.html : %.txt GIT-ASCIIDOCFLAGS

				$(patsubst %.txt,%.html,$(HOWTO_TXT)): %.html : %.txt GIT-ASCIIDOCFLAGS

					$(QUIET_ASCIIDOC)$(RM) $@+ $@ && \

					sed -e '1,/^$$/d' $< | \

					$(TXT_TO_HTML) - >$@+ && \

				@ -469,7 +479,13 @@ print-man1:

					@for i in $(MAN1_TXT); do echo $$i; done

				lint-docs::

					$(QUIET_LINT)$(PERL_PATH) lint-gitlink.perl

					$(QUIET_LINT)$(PERL_PATH) lint-gitlink.perl \

						$(HOWTO_TXT) $(DOC_DEP_TXT) \

						--section=1 $(MAN1_TXT) \

						--section=5 $(MAN5_TXT) \

						--section=7 $(MAN7_TXT); \

					$(PERL_PATH) lint-man-end-blurb.perl $(MAN_TXT); \

					$(PERL_PATH) lint-man-section-order.perl $(MAN_TXT);

				ifeq ($(wildcard po/Makefile),po/Makefile)

				doc-l10n install-l10n::

2

Documentation/MyFirstContribution.txt

View File

 @ -664,7 +664,7 @@ mention the right animal somewhere:
 ----
 test_expect_success 'runs correctly with no args and good output' '
 	git psuh >actual &&
 	test_i18ngrep Pony actual
 	grep Pony actual
 '
 ----

24

Documentation/RelNotes/2.30.3.txt

View File

 @ -1,24 +0,0 @@
 Git v2.30.2 Release Notes
 =========================
 This release addresses the security issue CVE-2022-24765.
 Fixes since v2.30.2
 -------------------
  * Build fix on Windows.
  * Fix `GIT_CEILING_DIRECTORIES` with Windows-style root directories.
  * CVE-2022-24765:
    On multi-user machines, Git users might find themselves
    unexpectedly in a Git worktree, e.g. when another user created a
    repository in `C:\.git`, in a mounted network drive or in a
    scratch space. Merely having a Git-aware prompt that runs `git
    status` (or `git diff`) and navigating to a directory which is
    supposedly not a Git worktree, or opening such a directory in an
    editor or IDE such as VS Code or Atom, will potentially run
    commands defined by that other user.
 Credit for finding this vulnerability goes to 俞晨东; The fix was
 authored by Johannes Schindelin.

21

Documentation/RelNotes/2.30.4.txt

View File

 @ -1,21 +0,0 @@
 Git v2.30.4 Release Notes
 =========================
 This release contains minor fix-ups for the changes that went into
 Git 2.30.3, which was made to address CVE-2022-24765.
  * The code that was meant to parse the new `safe.directory`
    configuration variable was not checking what configuration
    variable was being fed to it, which has been corrected.
  * '*' can be used as the value for the `safe.directory` variable to
    signal that the user considers that any directory is safe.
 Derrick Stolee (2):
       t0033: add tests for safe.directory
       setup: opt-out of check with safe.directory=*
 Matheus Valadares (1):
       setup: fix safe.directory key not being checked

12

Documentation/RelNotes/2.30.5.txt

View File

 @ -1,12 +0,0 @@
 Git v2.30.5 Release Notes
 =========================
 This release contains minor fix-ups for the changes that went into
 Git 2.30.3 and 2.30.4, addressing CVE-2022-29187.
  * The safety check that verifies a safe ownership of the Git
    worktree is now extended to also cover the ownership of the Git
    directory (and the `.git` file, if there is any).
 Carlo Marcelo Arenas Belón (1):
       setup: tighten ownership checks post CVE-2022-24765

60

Documentation/RelNotes/2.30.6.txt

View File

 @ -1,60 +0,0 @@
 Git v2.30.6 Release Notes
 =========================
 This release addresses the security issues CVE-2022-39253 and
 CVE-2022-39260.
 Fixes since v2.30.5
 -------------------
  * CVE-2022-39253:
    When relying on the `--local` clone optimization, Git dereferences
    symbolic links in the source repository before creating hardlinks
    (or copies) of the dereferenced link in the destination repository.
    This can lead to surprising behavior where arbitrary files are
    present in a repository's `$GIT_DIR` when cloning from a malicious
    repository.
    Git will no longer dereference symbolic links via the `--local`
    clone mechanism, and will instead refuse to clone repositories that
    have symbolic links present in the `$GIT_DIR/objects` directory.
    Additionally, the value of `protocol.file.allow` is changed to be
    "user" by default.
  * CVE-2022-39260:
    An overly-long command string given to `git shell` can result in
    overflow in `split_cmdline()`, leading to arbitrary heap writes and
    remote code execution when `git shell` is exposed and the directory
    `$HOME/git-shell-commands` exists.
    `git shell` is taught to refuse interactive commands that are
    longer than 4MiB in size. `split_cmdline()` is hardened to reject
    inputs larger than 2GiB.
 Credit for finding CVE-2022-39253 goes to Cory Snider of Mirantis. The
 fix was authored by Taylor Blau, with help from Johannes Schindelin.
 Credit for finding CVE-2022-39260 goes to Kevin Backhouse of GitHub.
 The fix was authored by Kevin Backhouse, Jeff King, and Taylor Blau.
 Jeff King (2):
       shell: add basic tests
       shell: limit size of interactive commands
 Kevin Backhouse (1):
       alias.c: reject too-long cmdline strings in split_cmdline()
 Taylor Blau (11):
       builtin/clone.c: disallow `--local` clones with symlinks
       t/lib-submodule-update.sh: allow local submodules
       t/t1NNN: allow local submodules
       t/2NNNN: allow local submodules
       t/t3NNN: allow local submodules
       t/t4NNN: allow local submodules
       t/t5NNN: allow local submodules
       t/t6NNN: allow local submodules
       t/t7NNN: allow local submodules
       t/t9NNN: allow local submodules
       transport: make `protocol.file.allow` be "user" by default

86

Documentation/RelNotes/2.30.7.txt

View File

 @ -1,86 +0,0 @@
 Git v2.30.7 Release Notes
 =========================
 This release addresses the security issues CVE-2022-41903 and
 CVE-2022-23521.
 Fixes since v2.30.6
 -------------------
  * CVE-2022-41903:
    git log has the ability to display commits using an arbitrary
    format with its --format specifiers. This functionality is also
    exposed to git archive via the export-subst gitattribute.
    When processing the padding operators (e.g., %<(, %<|(, %>(,
    %>>(, or %><( ), an integer overflow can occur in
    pretty.c::format_and_pad_commit() where a size_t is improperly
    stored as an int, and then added as an offset to a subsequent
    memcpy() call.
    This overflow can be triggered directly by a user running a
    command which invokes the commit formatting machinery (e.g., git
    log --format=...). It may also be triggered indirectly through
    git archive via the export-subst mechanism, which expands format
    specifiers inside of files within the repository during a git
    archive.
    This integer overflow can result in arbitrary heap writes, which
    may result in remote code execution.
 * CVE-2022-23521:
     gitattributes are a mechanism to allow defining attributes for
     paths. These attributes can be defined by adding a `.gitattributes`
     file to the repository, which contains a set of file patterns and
     the attributes that should be set for paths matching this pattern.
     When parsing gitattributes, multiple integer overflows can occur
     when there is a huge number of path patterns, a huge number of
     attributes for a single pattern, or when the declared attribute
     names are huge.
     These overflows can be triggered via a crafted `.gitattributes` file
     that may be part of the commit history. Git silently splits lines
     longer than 2KB when parsing gitattributes from a file, but not when
     parsing them from the index. Consequentially, the failure mode
     depends on whether the file exists in the working tree, the index or
     both.
     This integer overflow can result in arbitrary heap reads and writes,
     which may result in remote code execution.
 Credit for finding CVE-2022-41903 goes to Joern Schneeweisz of GitLab.
 An initial fix was authored by Markus Vervier of X41 D-Sec. Credit for
 finding CVE-2022-23521 goes to Markus Vervier and Eric Sesterhenn of X41
 D-Sec. This work was sponsored by OSTIF.
 The proposed fixes have been polished and extended to cover additional
 findings by Patrick Steinhardt of GitLab, with help from others on the
 Git security mailing list.
 Patrick Steinhardt (21):
       attr: fix overflow when upserting attribute with overly long name
       attr: fix out-of-bounds read with huge attribute names
       attr: fix integer overflow when parsing huge attribute names
       attr: fix out-of-bounds write when parsing huge number of attributes
       attr: fix out-of-bounds read with unreasonable amount of patterns
       attr: fix integer overflow with more than INT_MAX macros
       attr: harden allocation against integer overflows
       attr: fix silently splitting up lines longer than 2048 bytes
       attr: ignore attribute lines exceeding 2048 bytes
       attr: ignore overly large gitattributes files
       pretty: fix out-of-bounds write caused by integer overflow
       pretty: fix out-of-bounds read when left-flushing with stealing
       pretty: fix out-of-bounds read when parsing invalid padding format
       pretty: fix adding linefeed when placeholder is not expanded
       pretty: fix integer overflow in wrapping format
       utf8: fix truncated string lengths in `utf8_strnwidth()`
       utf8: fix returning negative string width
       utf8: fix overflow when returning string width
       utf8: fix checking for glyph width in `strbuf_utf8_replace()`
       utf8: refactor `strbuf_utf8_replace` to not rely on preallocated buffer
       pretty: restrict input lengths for padding and wrapping formats

52

Documentation/RelNotes/2.30.8.txt

View File

 @ -1,52 +0,0 @@
 Git v2.30.8 Release Notes
 =========================
 This release addresses the security issues CVE-2023-22490 and
 CVE-2023-23946.
 Fixes since v2.30.7
 -------------------
  * CVE-2023-22490:
    Using a specially-crafted repository, Git can be tricked into using
    its local clone optimization even when using a non-local transport.
    Though Git will abort local clones whose source $GIT_DIR/objects
    directory contains symbolic links (c.f., CVE-2022-39253), the objects
    directory itself may still be a symbolic link.
    These two may be combined to include arbitrary files based on known
    paths on the victim's filesystem within the malicious repository's
    working copy, allowing for data exfiltration in a similar manner as
    CVE-2022-39253.
  * CVE-2023-23946:
    By feeding a crafted input to "git apply", a path outside the
    working tree can be overwritten as the user who is running "git
    apply".
  * A mismatched type in `attr.c::read_attr_from_index()` which could
    cause Git to errantly reject attributes on Windows and 32-bit Linux
    has been corrected.
 Credit for finding CVE-2023-22490 goes to yvvdwf, and the fix was
 developed by Taylor Blau, with additional help from others on the
 Git security mailing list.
 Credit for finding CVE-2023-23946 goes to Joern Schneeweisz, and the
 fix was developed by Patrick Steinhardt.
 Johannes Schindelin (1):
       attr: adjust a mismatched data type
 Patrick Steinhardt (1):
       apply: fix writing behind newly created symbolic links
 Taylor Blau (3):
       t5619: demonstrate clone_local() with ambiguous transport
       clone: delay picking a transport until after get_repo_path()
       dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS

365

Documentation/RelNotes/2.31.0.txt Normal file

View File

 @ -0,0 +1,365 @@
 Git 2.31 Release Notes
 ======================
 Updates since v2.30
 -------------------
 Backward incompatible and other important changes
  * The "pack-redundant" command, which has been left stale with almost
    unusable performance issues, now warns loudly when it gets used, as
    we no longer want to recommend its use (instead just "repack -d"
    instead).
  * The development community has adopted Contributor Covenant v2.0 to
    update from v1.4 that we have been using.
  * The support for deprecated PCRE1 library has been dropped.
  * Fixes for CVE-2021-21300 in Git 2.30.2 (and earlier) is included.
 UI, Workflows & Features
  * The "--format=%(trailers)" mechanism gets enhanced to make it
    easier to design output for machine consumption.
  * When a user does not tell "git pull" to use rebase or merge, the
    command gives a loud message telling a user to choose between
    rebase or merge but creates a merge anyway, forcing users who would
    want to rebase to redo the operation.  Fix an early part of this
    problem by tightening the condition to give the message---there is
    no reason to stop or force the user to choose between rebase or
    merge if the history fast-forwards.
  * The configuration variable 'core.abbrev' can be set to 'no' to
    force no abbreviation regardless of the hash algorithm.
  * "git rev-parse" can be explicitly told to give output as absolute
    or relative path with the `--path-format=(absolute|relative)` option.
  * Bash completion (in contrib/) update to make it easier for
    end-users to add completion for their custom "git" subcommands.
  * "git maintenance" learned to drive scheduled maintenance on
    platforms whose native scheduling methods are not 'cron'.
  * After expiring a reflog and making a single commit, the reflog for
    the branch would record a single entry that knows both @{0} and
    @{1}, but we failed to answer "what commit were we on?", i.e. @{1}
  * "git bundle" learns "--stdin" option to read its refs from the
    standard input.  Also, it now does not lose refs whey they point
    at the same object.
  * "git log" learned a new "--diff-merges=<how>" option.
  * "git ls-files" can and does show multiple entries when the index is
    unmerged, which is a source for confusion unless -s/-u option is in
    use.  A new option --deduplicate has been introduced.
  * `git worktree list` now annotates worktrees as prunable, shows
    locked and prunable attributes in --porcelain mode, and gained
    a --verbose option.
  * "git clone" tries to locally check out the branch pointed at by
    HEAD of the remote repository after it is done, but the protocol
    did not convey the information necessary to do so when copying an
    empty repository.  The protocol v2 learned how to do so.
  * There are other ways than ".." for a single token to denote a
    "commit range", namely "<rev>^!" and "<rev>^-<n>", but "git
    range-diff" did not understand them.
  * The "git range-diff" command learned "--(left|right)-only" option
    to show only one side of the compared range.
  * "git mergetool" feeds three versions (base, local and remote) of
    a conflicted path unmodified.  The command learned to optionally
    prepare these files with unconflicted parts already resolved.
  * The .mailmap is documented to be read only from the root level of a
    working tree, but a stray file in a bare repository also was read
    by accident, which has been corrected.
  * "git maintenance" tool learned a new "pack-refs" maintenance task.
  * The error message given when a configuration variable that is
    expected to have a boolean value has been improved.
  * Signed commits and tags now allow verification of objects, whose
    two object names (one in SHA-1, the other in SHA-256) are both
    signed.
  * "git rev-list" command learned "--disk-usage" option.
  * "git {diff,log} --{skip,rotate}-to=<path>" allows the user to
    discard diff output for early paths or move them to the end of the
    output.
  * "git difftool" learned "--skip-to=<path>" option to restart an
    interrupted session from an arbitrary path.
  * "git grep" has been tweaked to be limited to the sparse checkout
    paths.
  * "git rebase --[no-]fork-point" gained a configuration variable
    rebase.forkPoint so that users do not have to keep specifying a
    non-default setting.
 Performance, Internal Implementation, Development Support etc.
  * A 3-year old test that was not testing anything useful has been
    corrected.
  * Retire more names with "sha1" in it.
  * The topological walk codepath is covered by new trace2 stats.
  * Update the Code-of-conduct to version 2.0 from the upstream (we've
    been using version 1.4).
  * "git mktag" validates its input using its own rules before writing
    a tag object---it has been updated to share the logic with "git
    fsck".
  * Two new ways to feed configuration variable-value pairs via
    environment variables have been introduced, and the way
    GIT_CONFIG_PARAMETERS encodes variable/value pairs has been tweaked
    to make it more robust.
  * Tests have been updated so that they do not to get affected by the
    name of the default branch "git init" creates.
  * "git fetch" learns to treat ref updates atomically in all-or-none
    fashion, just like "git push" does, with the new "--atomic" option.
  * The peel_ref() API has been replaced with peel_iterated_oid().
  * The .use_shell flag in struct child_process that is passed to
    run_command() API has been clarified with a bit more documentation.
  * Document, clean-up and optimize the code around the cache-tree
    extension in the index.
  * The ls-refs protocol operation has been optimized to narrow the
    sub-hierarchy of refs/ it walks to produce response.
  * When removing many branches and tags, the code used to do so one
    ref at a time.  There is another API it can use to delete multiple
    refs, and it makes quite a lot of performance difference when the
    refs are packed.
  * The "pack-objects" command needs to iterate over all the tags when
    automatic tag following is enabled, but it actually iterated over
    all refs and then discarded everything outside "refs/tags/"
    hierarchy, which was quite wasteful.
  * A perf script was made more portable.
  * Our setting of GitHub CI test jobs were a bit too eager to give up
    once there is even one failure found.  Tweak the knob to allow
    other jobs keep running even when we see a failure, so that we can
    find more failures in a single run.
  * We've carried compatibility codepaths for compilers without
    variadic macros for quite some time, but the world may be ready for
    them to be removed.  Force compilation failure on exotic platforms
    where variadic macros are not available to find out who screams in
    such a way that we can easily revert if it turns out that the world
    is not yet ready.
  * Code clean-up to ensure our use of hashtables using object names as
    keys use the "struct object_id" objects, not the raw hash values.
  * Lose the debugging aid that may have been useful in the past, but
    no longer is, in the "grep" codepaths.
  * Some pretty-format specifiers do not need the data in commit object
    (e.g. "%H"), but we were over-eager to load and parse it, which has
    been made even lazier.
  * Get rid of "GETTEXT_POISON" support altogether, which may or may
    not be controversial.
  * Introduce an on-disk file to record revindex for packdata, which
    traditionally was always created on the fly and only in-core.
  * The commit-graph learned to use corrected commit dates instead of
    the generation number to help topological revision traversal.
  * Piecemeal of rewrite of "git bisect" in C continues.
  * When a pager spawned by us exited, the trace log did not record its
    exit status correctly, which has been corrected.
  * Removal of GIT_TEST_GETTEXT_POISON continues.
  * The code to implement "git merge-base --independent" was poorly
    done and was kept from the very beginning of the feature.
  * Preliminary changes to fsmonitor integration.
  * Performance improvements for rename detection.
  * The common code to deal with "chunked file format" that is shared
    by the multi-pack-index and commit-graph files have been factored
    out, to help codepaths for both filetypes to become more robust.
  * The approach to "fsck" the incoming objects in "index-pack" is
    attractive for performance reasons (we have them already in core,
    inflated and ready to be inspected), but fundamentally cannot be
    applied fully when we receive more than one pack stream, as a tree
    object in one pack may refer to a blob object in another pack as
    ".gitmodules", when we want to inspect blobs that are used as
    ".gitmodules" file, for example.  Teach "index-pack" to emit
    objects that must be inspected later and check them in the calling
    "fetch-pack" process.
  * The logic to handle "trailer" related placeholders in the
    "--format=" mechanisms in the "log" family and "for-each-ref"
    family is getting unified.
  * Raise the buffer size used when writing the index file out from
    (obviously too small) 8kB to (clearly sufficiently large) 128kB.
  * It is reported that open() on some platforms (e.g. macOS Big Sur)
    can return EINTR even though our timers are set up with SA_RESTART.
    A workaround has been implemented and enabled for macOS to rerun
    open() transparently from the caller when this happens.
 Fixes since v2.30
 -----------------
  * Diagnose command line error of "git rebase" early.
  * Clean up option descriptions in "git cmd --help".
  * "git stash" did not work well in a sparsely checked out working
    tree.
  * Some tests expect that "ls -l" output has either '-' or 'x' for
    group executable bit, but setgid bit can be inherited from parent
    directory and make these fields 'S' or 's' instead, causing test
    failures.
  * "git for-each-repo --config=<var> <cmd>" should not run <cmd> for
    any repository when the configuration variable <var> is not defined
    even once.
  * Fix 2.29 regression where "git mergetool --tool-help" fails to list
    all the available tools.
  * Fix for procedure to building CI test environment for mac.
  * The implementation of "git branch --sort" wrt the detached HEAD
    display has always been hacky, which has been cleaned up.
  * Newline characters in the host and path part of git:// URL are
    now forbidden.
  * "git diff" showed a submodule working tree with untracked cruft as
    "Submodule commit <objectname>-dirty", but a natural expectation is
    that the "-dirty" indicator would align with "git describe --dirty",
    which does not consider having untracked files in the working tree
    as source of dirtiness.  The inconsistency has been fixed.
  * When more than one commit with the same patch ID appears on one
    side, "git log --cherry-pick A...B" did not exclude them all when a
    commit with the same patch ID appears on the other side.  Now it
    does.
  * Documentation for "git fsck" lost stale bits that has become
    incorrect.
  * Doc fix for packfile URI feature.
  * When "git rebase -i" processes "fixup" insn, there is no reason to
    clean up the commit log message, but we did the usual stripspace
    processing.  This has been corrected.
    (merge f7d42ceec5 js/rebase-i-commit-cleanup-fix later to maint).
  * Fix in passing custom args from "git clone" to "upload-pack" on the
    other side.
    (merge ad6b5fefbd jv/upload-pack-filter-spec-quotefix later to maint).
  * The command line completion (in contrib/) completed "git branch -d"
    with branch names, but "git branch -D" offered tagnames in addition,
    which has been corrected.  "git branch -M" had the same problem.
    (merge 27dc071b9a jk/complete-branch-force-delete later to maint).
  * When commands are started from a subdirectory, they may have to
    compare the path to the subdirectory (called prefix and found out
    from $(pwd)) with the tracked paths.  On macOS, $(pwd) and
    readdir() yield decomposed path, while the tracked paths are
    usually normalized to the precomposed form, causing mismatch.  This
    has been fixed by taking the same approach used to normalize the
    command line arguments.
    (merge 5c327502db tb/precompose-prefix-too later to maint).
  * Even though invocations of "die()" were logged to the trace2
    system, "BUG()"s were not, which has been corrected.
    (merge 0a9dde4a04 jt/trace2-BUG later to maint).
  * "git grep --untracked" is meant to be "let's ALSO find in these
    files on the filesystem" when looking for matches in the working
    tree files, and does not make any sense if the primary search is
    done against the index, or the tree objects.  The "--cached" and
    "--untracked" options have been marked as mutually incompatible.
    (merge 0c5d83b248 mt/grep-cached-untracked later to maint).
  * Fix "git fsck --name-objects" which apparently has not been used by
    anybody who is motivated enough to report breakage.
    (merge e89f89361c js/fsck-name-objects-fix later to maint).
  * Avoid individual tests in t5411 from getting affected by each other
    by forcing them to use separate output files during the test.
    (merge 822ee894f6 jx/t5411-unique-filenames later to maint).
  * Test to make sure "git rev-parse one-thing one-thing" gives
    the same thing twice (when one-thing is --since=X).
    (merge a5cdca4520 ew/rev-parse-since-test later to maint).
  * When certain features (e.g. grafts) used in the repository are
    incompatible with the use of the commit-graph, we used to silently
    turned commit-graph off; we now tell the user what we are doing.
    (merge c85eec7fc3 js/commit-graph-warning later to maint).
  * Objects that lost references can be pruned away, even when they
    have notes attached to it (and these notes will become dangling,
    which in turn can be pruned with "git notes prune").  This has been
    clarified in the documentation.
    (merge fa9ab027ba mz/doc-notes-are-not-anchors later to maint).
  * The error codepath around the "--temp/--prefix" feature of "git
    checkout-index" has been improved.
    (merge 3f7ba60350 mt/checkout-index-corner-cases later to maint).
  * The "git maintenance register" command had trouble registering bare
    repositories, which had been corrected.
  * A handful of multi-word configuration variable names in
    documentation that are spelled in all lowercase have been corrected
    to use the more canonical camelCase.
    (merge 7dd0eaa39c dl/doc-config-camelcase later to maint).
  * "git push $there --delete ''" should have been diagnosed as an
    error, but instead turned into a matching push, which has been
    corrected.
    (merge 20e416409f jc/push-delete-nothing later to maint).
  * Test script modernization.
    (merge 488acf15df sv/t7001-modernize later to maint).
  * An under-allocation for the untracked cache data has been corrected.
    (merge 6347d649bc jh/untracked-cache-fix later to maint).
  * Other code cleanup, docfix, build fix, etc.
    (merge e3f5da7e60 sg/t7800-difftool-robustify later to maint).
    (merge 9d336655ba js/doc-proto-v2-response-end later to maint).
    (merge 1b5b8cf072 jc/maint-column-doc-typofix later to maint).
    (merge 3a837b58e3 cw/pack-config-doc later to maint).
    (merge 01168a9d89 ug/doc-commit-approxidate later to maint).
    (merge b865734760 js/params-vs-args later to maint).

27

Documentation/RelNotes/2.31.1.txt Normal file

View File

 @ -0,0 +1,27 @@
 Git 2.31.1 Release Notes
 ========================
 Fixes since v2.31
 -----------------
  * The fsmonitor interface read from its input without making sure
    there is something to read from.  This bug is new in 2.31
    timeframe.
  * The data structure used by fsmonitor interface was not properly
    duplicated during an in-core merge, leading to use-after-free etc.
  * "git bisect" reimplemented more in C during 2.30 timeframe did not
    take an annotated tag as a good/bad endpoint well.  This regression
    has been corrected.
  * Fix macros that can silently inject unintended null-statements.
  * CALLOC_ARRAY() macro replaces many uses of xcalloc().
  * Update insn in Makefile comments to run fuzz-all target.
  * Fix a corner case bug in "git mv" on case insensitive systems,
    which was introduced in 2.29 timeframe.
 Also contains various documentation updates and code clean-ups.

397

Documentation/RelNotes/2.32.0.txt Normal file

View File

 @ -0,0 +1,397 @@
 Git 2.32 Release Notes
 ======================
 Backward compatibility notes
 ----------------------------
  * ".gitattributes", ".gitignore", and ".mailmap" files that are
    symbolic links are ignored.
  * "git apply --3way" used to first attempt a straight application,
    and only fell back to the 3-way merge algorithm when the stright
    application failed.  Starting with this version, the command will
    first try the 3-way merge algorithm and only when it fails (either
    resulting with conflict or the base versions of blobs are missing),
    falls back to the usual patch application.
 Updates since v2.31
 -------------------
 UI, Workflows & Features
  * It does not make sense to make ".gitattributes", ".gitignore" and
    ".mailmap" symlinks, as they are supposed to be usable from the
    object store (think: bare repositories where HEAD:.mailmap etc. are
    used).  When these files are symbolic links, we used to read the
    contents of the files pointed by them by mistake, which has been
    corrected.
  * "git stash show" learned to optionally show untracked part of the
    stash.
  * "git log --format='...'" learned "%(describe)" placeholder.
  * "git repack" so far has been only capable of repacking everything
    under the sun into a single pack (or split by size).  A cleverer
    strategy to reduce the cost of repacking a repository has been
    introduced.
  * The http codepath learned to let the credential layer to cache the
    password used to unlock a certificate that has successfully been
    used.
  * "git commit --fixup=<commit>", which was to tweak the changes made
    to the contents while keeping the original log message intact,
    learned "--fixup=(amend|reword):<commit>", that can be used to
    tweak both the message and the contents, and only the message,
    respectively.
  * When accessing a server with a URL like https://user:pass@site/, we
    did not to fall back to the basic authentication with the
    credential material embedded in the URL after the "Negotiate"
    authentication failed.  Now we do.
  * "git send-email" learned to honor the core.hooksPath configuration.
  * "git format-patch -v<n>" learned to allow a reroll count that is
    not an integer.
  * "git commit" learned "--trailer <key>[=<value>]" option; together
    with the interpret-trailers command, this will make it easier to
    support custom trailers.
  * "git clone --reject-shallow" option fails the clone as soon as we
    notice that we are cloning from a shallow repository.
  * A configuration variable has been added to force tips of certain
    refs to be given a reachability bitmap.
  * "gitweb" learned "e-mail privacy" feature to redact strings that
    look like e-mail addresses on various pages.
  * "git apply --3way" has always been "to fall back to 3-way merge
    only when straight application fails". Swap the order of falling
    back so that 3-way is always attempted first (only when the option
    is given, of course) and then straight patch application is used as
    a fallback when it fails.
  * "git apply" now takes "--3way" and "--cached" at the same time, and
    work and record results only in the index.
  * The command line completion (in contrib/) has learned that
    CHERRY_PICK_HEAD is a possible pseudo-ref.
  * Userdiff patterns for "Scheme" has been added.
  * "git log" learned "--diff-merges=<style>" option, with an
    associated configuration variable log.diffMerges.
  * "git log --format=..." placeholders learned %ah/%ch placeholders to
    request the --date=human output.
  * Replace GIT_CONFIG_NOSYSTEM mechanism to decline from reading the
    system-wide configuration file with GIT_CONFIG_SYSTEM that lets
    users specify from which file to read the system-wide configuration
    (setting it to an empty file would essentially be the same as
    setting NOSYSTEM), and introduce GIT_CONFIG_GLOBAL to override the
    per-user configuration in $HOME/.gitconfig.
  * "git add" and "git rm" learned not to touch those paths that are
    outside of sparse checkout.
  * "git rev-list" learns the "--filter=object:type=<type>" option,
    which can be used to exclude objects of the given kind from the
    packfile generated by pack-objects.
  * The command line completion (in contrib/) for "git stash" has been
    updated.
  * "git subtree" updates.
  * It is now documented that "format-patch" skips merges.
  * Options to "git pack-objects" that take numeric values like
    --window and --depth should not accept negative values; the input
    validation has been tightened.
  * The way the command line specified by the trailer.<token>.command
    configuration variable receives the end-user supplied value was
    both error prone and misleading.  An alternative to achieve the
    same goal in a safer and more intuitive way has been added, as
    the trailer.<token>.cmd configuration variable, to replace it.
  * "git add -i --dry-run" does not dry-run, which was surprising.  The
    combination of options has taught to error out.
  * "git push" learns to discover common ancestor with the receiving
    end over protocol v2.  This will hopefully make "git push" as
    efficient as "git fetch" in avoiding objects from getting
    transferred unnecessarily.
  * "git mailinfo" (hence "git am") learned the "--quoted-cr" option to
    control how lines ending with CRLF wrapped in base64 or qp are
    handled.
 Performance, Internal Implementation, Development Support etc.
  * Rename detection rework continues.
  * GIT_TEST_FAIL_PREREQS is a mechanism to skip test pieces with
    prerequisites to catch broken tests that depend on the side effects
    of optional pieces, but did not work at all when negative
    prerequisites were involved.
    (merge 27d578d904 jk/fail-prereq-testfix later to maint).
  * "git diff-index" codepath has been taught to trust fsmonitor status
    to reduce number of lstat() calls.
    (merge 7e5aa13d2c nk/diff-index-fsmonitor later to maint).
  * Reorganize Makefile to allow building git.o and other essential
    objects without extra stuff needed only for testing.
  * Preparatory API changes for parallel checkout.
  * A simple IPC interface gets introduced to build services like
    fsmonitor on top.
  * Fsck API clean-up.
  * SECURITY.md that is facing individual contributors and end users
    has been introduced.  Also a procedure to follow when preparing
    embargoed releases has been spelled out.
    (merge 09420b7648 js/security-md later to maint).
  * Optimize "rev-list --use-bitmap-index --objects" corner case that
    uses negative tags as the stopping points.
  * CMake update for vsbuild.
  * An on-disk reverse-index to map the in-pack location of an object
    back to its object name across multiple packfiles is introduced.
  * Generate [ec]tags under $(QUIET_GEN).
  * Clean-up codepaths that implements "git send-email --validate"
    option and improves the message from it.
  * The last remnant of gettext-poison has been removed.
  * The test framework has been taught to optionally turn the default
    merge strategy to "ort" throughout the system where we use
    three-way merges internally, like cherry-pick, rebase etc.,
    primarily to enhance its test coverage (the strategy has been
    available as an explicit "-s ort" choice).
  * A bit of code clean-up and a lot of test clean-up around userdiff
    area.
  * Handling of "promisor packs" that allows certain objects to be
    missing and lazily retrievable has been optimized (a bit).
  * When packet_write() fails, we gave an extra error message
    unnecessarily, which has been corrected.
  * The checkout machinery has been taught to perform the actual
    write-out of the files in parallel when able.
  * Show errno in the trace output in the error codepath that calls
    read_raw_ref method.
  * Effort to make the command line completion (in contrib/) safe with
    "set -u" continues.
  * Tweak a few tests for "log --format=..." that show timestamps in
    various formats.
  * The reflog expiry machinery has been taught to emit trace events.
  * Over-the-wire protocol learns a new request type to ask for object
    sizes given a list of object names.
 Fixes since v2.31
 -----------------
  * The fsmonitor interface read from its input without making sure
    there is something to read from.  This bug is new in 2.31
    timeframe.
  * The data structure used by fsmonitor interface was not properly
    duplicated during an in-core merge, leading to use-after-free etc.
  * "git bisect" reimplemented more in C during 2.30 timeframe did not
    take an annotated tag as a good/bad endpoint well.  This regression
    has been corrected.
  * Fix macros that can silently inject unintended null-statements.
  * CALLOC_ARRAY() macro replaces many uses of xcalloc().
  * Update insn in Makefile comments to run fuzz-all target.
  * Fix a corner case bug in "git mv" on case insensitive systems,
    which was introduced in 2.29 timeframe.
  * We had a code to diagnose and die cleanly when a required
    clean/smudge filter is missing, but an assert before that
    unnecessarily fired, hiding the end-user facing die() message.
    (merge 6fab35f748 mt/cleanly-die-upon-missing-required-filter later to maint).
  * Update C code that sets a few configuration variables when a remote
    is configured so that it spells configuration variable names in the
    canonical camelCase.
    (merge 0f1da600e6 ab/remote-write-config-in-camel-case later to maint).
  * A new configuration variable has been introduced to allow choosing
    which version of the generation number gets used in the
    commit-graph file.
    (merge 702110aac6 ds/commit-graph-generation-config later to maint).
  * Perf test update to work better in secondary worktrees.
    (merge 36e834abc1 jk/perf-in-worktrees later to maint).
  * Updates to memory allocation code around the use of pcre2 library.
    (merge c1760352e0 ab/grep-pcre2-allocfix later to maint).
  * "git -c core.bare=false clone --bare ..." would have segfaulted,
    which has been corrected.
    (merge 75555676ad bc/clone-bare-with-conflicting-config later to maint).
  * When "git checkout" removes a path that does not exist in the
    commit it is checking out, it wasn't careful enough not to follow
    symbolic links, which has been corrected.
    (merge fab78a0c3d mt/checkout-remove-nofollow later to maint).
  * A few option description strings started with capital letters,
    which were corrected.
    (merge 5ee90326dc cc/downcase-opt-help later to maint).
  * Plug or annotate remaining leaks that trigger while running the
    very basic set of tests.
    (merge 68ffe095a2 ah/plugleaks later to maint).
  * The hashwrite() API uses a buffering mechanism to avoid calling
    write(2) too frequently. This logic has been refactored to be
    easier to understand.
    (merge ddaf1f62e3 ds/clarify-hashwrite later to maint).
  * "git cherry-pick/revert" with or without "--[no-]edit" did not spawn
    the editor as expected (e.g. "revert --no-edit" after a conflict
    still asked to edit the message), which has been corrected.
    (merge 39edfd5cbc en/sequencer-edit-upon-conflict-fix later to maint).
  * "git daemon" has been tightened against systems that take backslash
    as directory separator.
    (merge 9a7f1ce8b7 rs/daemon-sanitize-dir-sep later to maint).
  * A NULL-dereference bug has been corrected in an error codepath in
    "git for-each-ref", "git branch --list" etc.
    (merge c685450880 jk/ref-filter-segfault-fix later to maint).
  * Streamline the codepath to fix the UTF-8 encoding issues in the
    argv[] and the prefix on macOS.
    (merge c7d0e61016 tb/precompose-prefix-simplify later to maint).
  * The command-line completion script (in contrib/) had a couple of
    references that would have given a warning under the "-u" (nounset)
    option.
    (merge c5c0548d79 vs/completion-with-set-u later to maint).
  * When "git pack-objects" makes a literal copy of a part of existing
    packfile using the reachability bitmaps, its update to the progress
    meter was broken.
    (merge 8e118e8490 jk/pack-objects-bitmap-progress-fix later to maint).
  * The dependencies for config-list.h and command-list.h were broken
    when the former was split out of the latter, which has been
    corrected.
    (merge 56550ea718 sg/bugreport-fixes later to maint).
  * "git push --quiet --set-upstream" was not quiet when setting the
    upstream branch configuration, which has been corrected.
    (merge f3cce896a8 ow/push-quiet-set-upstream later to maint).
  * The prefetch task in "git maintenance" assumed that "git fetch"
    from any remote would fetch all its local branches, which would
    fetch too much if the user is interested in only a subset of
    branches there.
    (merge 32f67888d8 ds/maintenance-prefetch-fix later to maint).
  * Clarify that pathnames recorded in Git trees are most often (but
    not necessarily) encoded in UTF-8.
    (merge 9364bf465d ab/pathname-encoding-doc later to maint).
  * "git --config-env var=val cmd" weren't accepted (only
    --config-env=var=val was).
    (merge c331551ccf ps/config-env-option-with-separate-value later to maint).
  * When the reachability bitmap is in effect, the "do not lose
    recently created objects and those that are reachable from them"
    safety to protect us from races were disabled by mistake, which has
    been corrected.
    (merge 2ba582ba4c jk/prune-with-bitmap-fix later to maint).
  * Cygwin pathname handling fix.
    (merge bccc37fdc7 ad/cygwin-no-backslashes-in-paths later to maint).
  * "git rebase --[no-]reschedule-failed-exec" did not work well with
    its configuration variable, which has been corrected.
    (merge e5b32bffd1 ab/rebase-no-reschedule-failed-exec later to maint).
  * Portability fix for command line completion script (in contrib/).
    (merge f2acf763e2 si/zsh-complete-comment-fix later to maint).
  * "git repack -A -d" in a partial clone unnecessarily loosened
    objects in promisor pack.
  * "git bisect skip" when custom words are used for new/old did not
    work, which has been corrected.
  * A few variants of informational message "Already up-to-date" has
    been rephrased.
    (merge ad9322da03 js/merge-already-up-to-date-message-reword later to maint).
  * "git submodule update --quiet" did not propagate the quiet option
    down to underlying "git fetch", which has been corrected.
    (merge 62af4bdd42 nc/submodule-update-quiet later to maint).
  * Document that our test can use "local" keyword.
    (merge a84fd3bcc6 jc/test-allows-local later to maint).
  * The word-diff mode has been taught to work better with a word
    regexp that can match an empty string.
    (merge 0324e8fc6b pw/word-diff-zero-width-matches later to maint).
  * "git p4" learned to find branch points more efficiently.
    (merge 6b79818bfb jk/p4-locate-branch-point-optim later to maint).
  * When "git update-ref -d" removes a ref that is packed, it left
    empty directories under $GIT_DIR/refs/ for
    (merge 5f03e5126d wc/packed-ref-removal-cleanup later to maint).
  * Other code cleanup, docfix, build fix, etc.
    (merge f451960708 dl/cat-file-doc-cleanup later to maint).
    (merge 12604a8d0c sv/t9801-test-path-is-file-cleanup later to maint).
    (merge ea7e63921c jr/doc-ignore-typofix later to maint).
    (merge 23c781f173 ps/update-ref-trans-hook-doc later to maint).
    (merge 42efa1231a jk/filter-branch-sha256 later to maint).
    (merge 4c8e3dca6e tb/push-simple-uses-branch-merge-config later to maint).
    (merge 6534d436a2 bs/asciidoctor-installation-hints later to maint).
    (merge 47957485b3 ab/read-tree later to maint).
    (merge 2be927f3d1 ab/diff-no-index-tests later to maint).
    (merge 76593c09bb ab/detox-gettext-tests later to maint).
    (merge 28e29ee38b jc/doc-format-patch-clarify later to maint).
    (merge fc12b6fdde fm/user-manual-use-preface later to maint).
    (merge dba94e3a85 cc/test-helper-bloom-usage-fix later to maint).
    (merge 61a7660516 hn/reftable-tables-doc-update later to maint).
    (merge 81ed96a9b2 jt/fetch-pack-request-fix later to maint).
    (merge 151b6c2dd7 jc/doc-do-not-capitalize-clarification later to maint).
    (merge 9160068ac6 js/access-nul-emulation-on-windows later to maint).
    (merge 7a14acdbe6 po/diff-patch-doc later to maint).
    (merge f91371b948 pw/patience-diff-clean-up later to maint).
    (merge 3a7f0908b6 mt/clean-clean later to maint).
    (merge d4e2d15a8b ab/streaming-simplify later to maint).
    (merge 0e59f7ad67 ah/merge-ort-i18n later to maint).
    (merge e6f68f62e0 ls/typofix later to maint).

11

Documentation/SubmittingPatches

View File

 @ -117,10 +117,13 @@ If in doubt which identifier to use, run `git log --no-merges` on the
 files you are modifying to see the current conventions.
 [[summary-section]]
 It's customary to start the remainder of the first line after "area: "
 with a lower-case letter. E.g. "doc: clarify...", not "doc:
 Clarify...", or "githooks.txt: improve...", not "githooks.txt:
 Improve...".
 The title sentence after the "area:" prefix omits the full stop at the
 end, and its first word is not capitalized unless there is a reason to
 capitalize it other than because it is the first word in the sentence.
 E.g. "doc: clarify...", not "doc: Clarify...", or "githooks.txt:
 improve...", not "githooks.txt: Improve...".  But "refs: HEAD is also
 treated as a ref" is correct, as we spell `HEAD` in all caps even when
 it appears in the middle of a sentence.
 [[meaningful-message]]
 The body should provide a meaningful commit message, which:

2

Documentation/blame-options.txt

View File

 @ -1,6 +1,6 @@
 -b::
 	Show blank SHA-1 for boundary commits.  This can also
 	be controlled via the `blame.blankboundary` config option.
 	be controlled via the `blame.blankBoundary` config option.
 --root::
 	Do not treat root commits as boundaries.  This can also be

6

Documentation/config.txt

View File

 @ -46,7 +46,7 @@ Subsection names are case sensitive and can contain any characters except
 newline and the null byte. Doublequote `"` and backslash can be included
 by escaping them as `\"` and `\\`, respectively. Backslashes preceding
 other characters are dropped when reading; for example, `\t` is read as
 `t` and `\0` is read as `0` Section headers cannot span multiple lines.
 `t` and `\0` is read as `0`. Section headers cannot span multiple lines.
 Variables may belong directly to a section or to a given subsection. You
 can have `[section]` if you have `[section "subsection"]`, but you don't
 need to.
 @ -398,6 +398,8 @@ include::config/interactive.txt[]
 include::config/log.txt[]
 include::config/lsrefs.txt[]
 include::config/mailinfo.txt[]
 include::config/mailmap.txt[]
 @ -438,8 +440,6 @@ include::config/rerere.txt[]
 include::config/reset.txt[]
 include::config/safe.txt[]
 include::config/sendemail.txt[]
 include::config/sequencer.txt[]

4

Documentation/config/advice.txt

View File

 @ -119,4 +119,8 @@ advice.*::
 	addEmptyPathspec::
 		Advice shown if a user runs the add command without providing
 		the pathspec parameter.
 	updateSparsePath::
 		Advice shown when either linkgit:git-add[1] or linkgit:git-rm[1]
 		is asked to update index entries outside the current sparse
 		checkout.
 --

21

Documentation/config/checkout.txt

View File

 @ -21,3 +21,24 @@ checkout.guess::
 	Provides the default value for the `--guess` or `--no-guess`
 	option in `git checkout` and `git switch`. See
 	linkgit:git-switch[1] and linkgit:git-checkout[1].
 checkout.workers::
 	The number of parallel workers to use when updating the working tree.
 	The default is one, i.e. sequential execution. If set to a value less
 	than one, Git will use as many workers as the number of logical cores
 	available. This setting and `checkout.thresholdForParallelism` affect
 	all commands that perform checkout. E.g. checkout, clone, reset,
 	sparse-checkout, etc.
 +
 Note: parallel checkout usually delivers better performance for repositories
 located on SSDs or over NFS. For repositories on spinning disks and/or machines
 with a small number of cores, the default sequential checkout often performs
 better. The size and compression level of a repository might also influence how
 well the parallel version performs.
 checkout.thresholdForParallelism::
 	When running parallel checkout with a small number of files, the cost
 	of subprocess spawning and inter-process communication might outweigh
 	the parallelization gains. This setting allows to define the minimum
 	number of files for which parallel checkout should be attempted. The
 	default is 100.

4

Documentation/config/clone.txt

View File

 @ -2,3 +2,7 @@ clone.defaultRemoteName::
 	The name of the remote to create when cloning a repository.  Defaults to
 	`origin`, and can be overridden by passing the `--origin` command-line
 	option to linkgit:git-clone[1].
 clone.rejectShallow::
 	Reject to clone a repository if it is a shallow one, can be overridden by
 	passing option `--reject-shallow` in command line. See linkgit:git-clone[1]

6

Documentation/config/commitgraph.txt

View File

 @ -1,3 +1,9 @@
 commitGraph.generationVersion::
 	Specifies the type of generation number version to use when writing
 	or reading the commit-graph file. If version 1 is specified, then
 	the corrected commit dates will not be written or read. Defaults to
 .
 commitGraph.maxNewFilters::
 	Specifies the default value for the `--max-new-filters` option of `git
 	commit-graph write` (c.f., linkgit:git-commit-graph[1]).

2

Documentation/config/core.txt

View File

 @ -625,4 +625,6 @@ core.abbrev::
 	computed based on the approximate number of packed objects
 	in your repository, which hopefully is enough for
 	abbreviated object names to stay unique for some time.
 	If set to "no", no abbreviation is made and the object names
 	are shown in their full length.
 	The minimum length is 4.

2

Documentation/config/diff.txt

View File

 @ -85,6 +85,8 @@ diff.ignoreSubmodules::
 	and 'git status' when `status.submoduleSummary` is set unless it is
 	overridden by using the --ignore-submodules command-line option.
 	The 'git submodule' commands are not affected by this setting.
 	By default this is set to untracked so that any untracked
 	submodules are ignored.
 diff.mnemonicPrefix::
 	If set, 'git diff' uses a prefix pair that is different from the

5

Documentation/config/index.txt

View File

 @ -14,6 +14,11 @@ index.recordOffsetTable::
 	Defaults to 'true' if index.threads has been explicitly enabled,
 	'false' otherwise.
 index.sparse::
 	When enabled, write the index using sparse-directory entries. This
 	has no effect unless `core.sparseCheckout` and
 	`core.sparseCheckoutCone` are both enabled. Defaults to 'false'.
 index.threads::
 	Specifies the number of threads to spawn when loading the index.
 	This is meant to reduce index load time on multiprocessor machines.

2

Documentation/config/init.txt

View File

 @ -4,4 +4,4 @@ init.templateDir::
 init.defaultBranch::
 	Allows overriding the default branch name e.g. when initializing
 	a new repository or when cloning an empty repository.
 	a new repository.

5

Documentation/config/log.txt

View File

 @ -24,6 +24,11 @@ log.excludeDecoration::
 	the config option can be overridden by the `--decorate-refs`
 	option.
 log.diffMerges::
 	Set default diff format to be used for merge commits. See
 	`--diff-merges` in linkgit:git-log[1] for details.
 	Defaults to `separate`.
 log.follow::
 	If `true`, `git log` will act as if the `--follow` option was used when
 	a single <path> is given.  This has the same limitations as `--follow`,

9

Documentation/config/lsrefs.txt Normal file

View File

 @ -0,0 +1,9 @@
 lsrefs.unborn::
 	May be "advertise" (the default), "allow", or "ignore". If "advertise",
 	the server will respond to the client sending "unborn" (as described in
 	protocol-v2.txt) and will advertise support for this feature during the
 	protocol v2 capability advertisement. "allow" is the same as
 	"advertise" except that the server will not advertise support for this
 	feature; this is useful for load-balanced servers that cannot be
 	updated atomically (for example), since the administrator could
 	configure "allow", then after a delay, configure "advertise".

5

Documentation/config/maintenance.txt

View File

 @ -15,8 +15,9 @@ maintenance.strategy::
 * `none`: This default setting implies no task are run at any schedule.
 * `incremental`: This setting optimizes for performing small maintenance
   activities that do not delete any data. This does not schedule the `gc`
   task, but runs the `prefetch` and `commit-graph` tasks hourly and the
   `loose-objects` and `incremental-repack` tasks daily.
   task, but runs the `prefetch` and `commit-graph` tasks hourly, the
   `loose-objects` and `incremental-repack` tasks daily, and the `pack-refs`
   task weekly.
 maintenance.<task>.enabled::
 	This boolean config option controls whether the maintenance task

15

Documentation/config/mergetool.txt

View File

 @ -13,6 +13,11 @@ mergetool.<tool>.cmd::
 	merged; 'MERGED' contains the name of the file to which the merge
 	tool should write the results of a successful merge.
 mergetool.<tool>.hideResolved::
 	Allows the user to override the global `mergetool.hideResolved` value
 	for a specific tool. See `mergetool.hideResolved` for the full
 	description.
 mergetool.<tool>.trustExitCode::
 	For a custom merge command, specify whether the exit code of
 	the merge command can be used to determine whether the merge was
 @ -40,6 +45,16 @@ mergetool.meld.useAutoMerge::
 	value of `false` avoids using `--auto-merge` altogether, and is the
 	default value.
 mergetool.hideResolved::
 	During a merge Git will automatically resolve as many conflicts as
 	possible and write the 'MERGED' file containing conflict markers around
 	any conflicts that it cannot resolve; 'LOCAL' and 'REMOTE' normally
 	represent the versions of the file from before Git's conflict
 	resolution. This flag causes 'LOCAL' and 'REMOTE' to be overwriten so
 	that only the unresolved conflicts are presented to the merge tool. Can
 	be configured per-tool via the `mergetool.<tool>.hideResolved`
 	configuration variable. Defaults to `false`.
 mergetool.keepBackup::
 	After performing a merge, the original file with conflict markers
 	can be saved as a file with a `.orig` extension.  If this variable

22

Documentation/config/pack.txt

View File

 @ -122,6 +122,21 @@ pack.useSparse::
 	commits contain certain types of direct renames. Default is
 	`true`.
 pack.preferBitmapTips::
 	When selecting which commits will receive bitmaps, prefer a
 	commit at the tip of any reference that is a suffix of any value
 	of this configuration over any other commits in the "selection
 	window".
 +
 Note that setting this configuration to `refs/foo` does not mean that
 the commits at the tips of `refs/foo/bar` and `refs/foo/baz` will
 necessarily be selected. This is because commits are selected for
 bitmaps from within a series of windows of variable length.
 +
 If a commit at the tip of any reference which is a suffix of any value
 of this configuration is seen in a window, it is immediately given
 preference over any other commit in that window.
 pack.writeBitmaps (deprecated)::
 	This is a deprecated synonym for `repack.writeBitmaps`.
 @ -133,3 +148,10 @@ pack.writeBitmapHashCache::
 	between an older, bitmapped pack and objects that have been
 	pushed since the last gc). The downside is that it consumes 4
 	bytes per object of disk space. Defaults to true.
 pack.writeReverseIndex::
 	When true, git will write a corresponding .rev file (see:
 	link:../technical/pack-format.html[Documentation/technical/pack-format.txt])
 	for each new packfile that it writes in all places except for
 	linkgit:git-fast-import[1] and in the bulk checkin mechanism.
 	Defaults to false.

6

Documentation/config/protocol.txt

View File

 @ -1,10 +1,10 @@
 protocol.allow::
 	If set, provide a user defined default policy for all protocols which
 	don't explicitly have a policy (`protocol.<name>.allow`).  By default,
 	if unset, known-safe protocols (http, https, git, ssh) have a
 	if unset, known-safe protocols (http, https, git, ssh, file) have a
 	default policy of `always`, known-dangerous protocols (ext) have a
 	default policy of `never`, and all other protocols (including file)
 	have a default policy of `user`.  Supported policies:
 	default policy of `never`, and all other protocols have a default
 	policy of `user`.  Supported policies:
 +
 --

7

Documentation/config/push.txt

View File

 @ -120,3 +120,10 @@ push.useForceIfIncludes::
 	`--force-if-includes` as an option to linkgit:git-push[1]
 	in the command line. Adding `--no-force-if-includes` at the
 	time of push overrides this configuration setting.
 push.negotiate::
 	If set to "true", attempt to reduce the size of the packfile
 	sent by rounds of negotiation in which the client and the
 	server attempt to find commits in common. If "false", Git will
 	rely solely on the server's ref advertisement to find commits
 	in common.

10

Documentation/config/rebase.txt

View File

 @ -1,10 +1,3 @@
 rebase.useBuiltin::
 	Unused configuration variable. Used in Git versions 2.20 and
 .21 as an escape hatch to enable the legacy shellscript
 	implementation of rebase. Now the built-in rewrite of it in C
 	is always used. Setting this will emit a warning, to alert any
 	remaining users that setting this now does nothing.
 rebase.backend::
 	Default backend to use for rebasing.  Possible choices are
 	'apply' or 'merge'.  In the future, if the merge backend gains
 @ -68,3 +61,6 @@ rebase.rescheduleFailedExec::
 	Automatically reschedule `exec` commands that failed. This only makes
 	sense in interactive mode (or when an `--exec` option was provided).
 	This is the same as specifying the `--reschedule-failed-exec` option.
 rebase.forkPoint::
 	If set to false set `--no-fork-point` option by default.

42

Documentation/config/safe.txt

View File

 @ -1,42 +0,0 @@
 safe.directory::
 	These config entries specify Git-tracked directories that are
 	considered safe even if they are owned by someone other than the
 	current user. By default, Git will refuse to even parse a Git
 	config of a repository owned by someone else, let alone run its
 	hooks, and this config setting allows users to specify exceptions,
 	e.g. for intentionally shared repositories (see the `--shared`
 	option in linkgit:git-init[1]).
 +
 This is a multi-valued setting, i.e. you can add more than one directory
 via `git config --add`. To reset the list of safe directories (e.g. to
 override any such directories specified in the system config), add a
 `safe.directory` entry with an empty value.
 +
 This config setting is only respected when specified in a system or global
 config, not when it is specified in a repository config or via the command
 line option `-c safe.directory=<path>`.
 +
 The value of this setting is interpolated, i.e. `~/<path>` expands to a
 path relative to the home directory and `%(prefix)/<path>` expands to a
 path relative to Git's (runtime) prefix.
 +
 To completely opt-out of this security check, set `safe.directory` to the
 string `*`. This will allow all repositories to be treated as if their
 directory was listed in the `safe.directory` list. If `safe.directory=*`
 is set in system config and you want to re-enable this protection, then
 initialize your list with an empty value before listing the repositories
 that you deem safe.
 +
 As explained, Git only allows you to access repositories owned by
 yourself, i.e. the user who is running Git, by default.  When Git
 is running as 'root' in a non Windows platform that provides sudo,
 however, git checks the SUDO_UID environment variable that sudo creates
 and will allow access to the uid recorded as its value in addition to
 the id from 'root'.
 This is to make it easy to perform a common sequence during installation
 "make && sudo make install".  A git process running under 'sudo' runs as
 'root' but the 'sudo' command exports the environment variable to record
 which id the original user has.
 If that is not what you would prefer and want git to only trust
 repositories that are owned by root instead, then you can remove
 the `SUDO_UID` variable from root's environment before invoking git.

5

Documentation/config/stash.txt

View File

 @ -5,6 +5,11 @@ stash.useBuiltin::
 	is always used. Setting this will emit a warning, to alert any
 	remaining users that setting this now does nothing.
 stash.showIncludeUntracked::
 	If this is set to true, the `git stash show` command without an
 	option will show the untracked files of a stash entry.  Defaults to
 	false. See description of 'show' command in linkgit:git-stash[1].
 stash.showPatch::
 	If this is set to true, the `git stash show` command without an
 	option will show the stash entry in patch form.  Defaults to false.

9

Documentation/config/uploadpack.txt

View File

 @ -59,15 +59,16 @@ uploadpack.allowFilter::
 uploadpackfilter.allow::
 	Provides a default value for unspecified object filters (see: the
 	below configuration variable).
 	below configuration variable). If set to `true`, this will also
 	enable all filters which get added in the future.
 	Defaults to `true`.
 uploadpackfilter.<filter>.allow::
 	Explicitly allow or ban the object filter corresponding to
 	`<filter>`, where `<filter>` may be one of: `blob:none`,
 	`blob:limit`, `tree`, `sparse:oid`, or `combine`. If using
 	combined filters, both `combine` and all of the nested filter
 	kinds must be allowed. Defaults to `uploadpackfilter.allow`.
 	`blob:limit`, `object:type`, `tree`, `sparse:oid`, or `combine`.
 	If using combined filters, both `combine` and all of the nested
 	filter kinds must be allowed. Defaults to `uploadpackfilter.allow`.
 uploadpackfilter.tree.maxDepth::
 	Only allow `--filter=tree:<n>` when `<n>` is no more than the value of

11

Documentation/date-formats.txt

View File

 @ -1,10 +1,7 @@
 DATE FORMATS
 ------------
 The `GIT_AUTHOR_DATE`, `GIT_COMMITTER_DATE` environment variables
 ifdef::git-commit[]
 and the `--date` option
 endif::git-commit[]
 The `GIT_AUTHOR_DATE` and `GIT_COMMITTER_DATE` environment variables
 support the following date formats:
 Git internal format::
 @ -26,3 +23,9 @@ ISO 8601::
 +
 NOTE: In addition, the date part is accepted in the following formats:
 `YYYY.MM.DD`, `MM/DD/YYYY` and `DD.MM.YYYY`.
 ifdef::git-commit[]
 In addition to recognizing all date formats above, the `--date` option
 will also try to make sense of other, more human-centric date formats,
 such as relative dates like "yesterday" or "last Friday at noon".
 endif::git-commit[]

13

Documentation/diff-generate-patch.txt

View File

 @ -11,7 +11,7 @@ linkgit:git-diff-files[1]
 with the `-p` option produces patch text.
 You can customize the creation of patch text via the
 `GIT_EXTERNAL_DIFF` and the `GIT_DIFF_OPTS` environment variables
 (see linkgit:git[1]).
 (see linkgit:git[1]), and the `diff` attribute (see linkgit:gitattributes[5]).
 What the -p option produces is slightly different from the traditional
 diff format:
 @ -74,6 +74,11 @@ separate lines indicate the old and the new mode.
       rename from b
       rename to a
 .  Hunk headers mention the name of the function to which the hunk
     applies.  See "Defining a custom hunk-header" in
     linkgit:gitattributes[5] for details of how to tailor to this to
     specific languages.
 Combined diff format
 --------------------
 @ -81,9 +86,9 @@ Combined diff format
 Any diff-generating command can take the `-c` or `--cc` option to
 produce a 'combined diff' when showing a merge. This is the default
 format when showing merges with linkgit:git-diff[1] or
 linkgit:git-show[1]. Note also that you can give the `-m` option to any
 of these commands to force generation of diffs with individual parents
 of a merge.
 linkgit:git-show[1]. Note also that you can give suitable
 `--diff-merges` option to any of these commands to force generation of
 diffs in specific format.
 A "combined diff" format looks like this:

71

Documentation/diff-options.txt

View File

 @ -33,6 +33,64 @@ endif::git-diff[]
 	show the patch by default, or to cancel the effect of `--patch`.
 endif::git-format-patch[]
 ifdef::git-log[]
 --diff-merges=(off|none|on|first-parent|1|separate|m|combined|c|dense-combined|cc)::
 --no-diff-merges::
 	Specify diff format to be used for merge commits. Default is
 	{diff-merges-default} unless `--first-parent` is in use, in which case
 	`first-parent` is the default.
 +
 --diff-merges=(off|none):::
 --no-diff-merges:::
 	Disable output of diffs for merge commits. Useful to override
 	implied value.
 +
 --diff-merges=on:::
 --diff-merges=m:::
 -m:::
 	This option makes diff output for merge commits to be shown in
 	the default format. `-m` will produce the output only if `-p`
 	is given as well. The default format could be changed using
 	`log.diffMerges` configuration parameter, which default value
 	is `separate`.
 +
 --diff-merges=first-parent:::
 --diff-merges=1:::
 	This option makes merge commits show the full diff with
 	respect to the first parent only.
 +
 --diff-merges=separate:::
 	This makes merge commits show the full diff with respect to
 	each of the parents. Separate log entry and diff is generated
 	for each parent.
 +
 --diff-merges=combined:::
 --diff-merges=c:::
 -c:::
 	With this option, diff output for a merge commit shows the
 	differences from each of the parents to the merge result
 	simultaneously instead of showing pairwise diff between a
 	parent and the result one at a time. Furthermore, it lists
 	only files which were modified from all parents. `-c` implies
 	`-p`.
 +
 --diff-merges=dense-combined:::
 --diff-merges=cc:::
 --cc:::
 	With this option the output produced by
 	`--diff-merges=combined` is further compressed by omitting
 	uninteresting hunks whose contents in the parents have only
 	two variants and the merge result picks one of them without
 	modification.  `--cc` implies `-p`.
 --combined-all-paths::
 	This flag causes combined diffs (used for merge commits) to
 	list the name of the file from all parents.  It thus only has
 	effect when `--diff-merges=[dense-]combined` is in use, and
 	is likely only useful if filename changes are detected (i.e.
 	when either rename or copy detection have been requested).
 endif::git-log[]
 -U<n>::
 --unified=<n>::
 	Generate diffs with <n> lines of context instead of
 @ -242,11 +300,14 @@ explained for the configuration variable `core.quotePath` (see
 linkgit:git-config[1]).
 --name-only::
 	Show only names of changed files.
 	Show only names of changed files. The file names are often encoded in UTF-8.
 	For more information see the discussion about encoding in the linkgit:git-log[1]
 	manual page.
 --name-status::
 	Show only names and status of changed files. See the description
 	of the `--diff-filter` option on what the status letters mean.
 	Just like `--name-only` the file names are often encoded in UTF-8.
 --submodule[=<format>]::
 	Specify how differences in submodules are shown.  When specifying
 @ -649,6 +710,14 @@ matches a pattern if removing any number of the final pathname
 components matches the pattern.  For example, the pattern "`foo*bar`"
 matches "`fooasdfbar`" and "`foo/bar/baz/asdf`" but not "`foobarx`".
 --skip-to=<file>::
 --rotate-to=<file>::
 	Discard the files before the named <file> from the output
 	(i.e. 'skip to'), or move them to the end of the output
 	(i.e. 'rotate to').  These were invented primarily for use
 	of the `git difftool` command, and may not be very useful
 	otherwise.
 ifndef::git-format-patch[]
 -R::
 	Swap two inputs; that is, show differences from index or

9

Documentation/fetch-options.txt

View File

 @ -7,6 +7,10 @@
 	existing contents of `.git/FETCH_HEAD`.  Without this
 	option old data in `.git/FETCH_HEAD` will be overwritten.
 --atomic::
 	Use an atomic transaction to update local refs. Either all refs are
 	updated, or on error, no refs are updated.
 --depth=<depth>::
 	Limit fetching to the specified number of commits from the tip of
 	each remote branch history. If fetching to a 'shallow' repository
 @ -106,6 +110,11 @@ ifndef::git-pull[]
 	setting `fetch.writeCommitGraph`.
 endif::git-pull[]
 --prefetch::
 	Modify the configured refspec to place all refs into the
 	`refs/prefetch/` namespace. See the `prefetch` task in
 	linkgit:git-maintenance[1].
 -p::
 --prune::
 	Before fetching, remove any remote-tracking references that no

6

Documentation/git-am.txt

View File

 @ -15,6 +15,7 @@ SYNOPSIS
 	 [--whitespace=<option>] [-C<n>] [-p<n>] [--directory=<dir>]
 	 [--exclude=<path>] [--include=<path>] [--reject] [-q | --quiet]
 	 [--[no-]scissors] [-S[<keyid>]] [--patch-format=<format>]
 	 [--quoted-cr=<action>]
 	 [(<mbox> | <Maildir>)...]
 'git am' (--continue | --skip | --abort | --quit | --show-current-patch[=(diff|raw)])
 @ -59,6 +60,9 @@ OPTIONS
 --no-scissors::
 	Ignore scissors lines (see linkgit:git-mailinfo[1]).
 --quoted-cr=<action>::
 	This flag will be passed down to 'git mailinfo' (see linkgit:git-mailinfo[1]).
 -m::
 --message-id::
 	Pass the `-m` flag to 'git mailinfo' (see linkgit:git-mailinfo[1]),
 @ -79,7 +83,7 @@ OPTIONS
 	Pass `-u` flag to 'git mailinfo' (see linkgit:git-mailinfo[1]).
 	The proposed commit log message taken from the e-mail
 	is re-coded into UTF-8 encoding (configuration variable
 	`i18n.commitencoding` can be used to specify project's
 	`i18n.commitEncoding` can be used to specify project's
 	preferred encoding if it is not UTF-8).
 +
 This was optional in prior versions of git, but now it is the

11

Documentation/git-apply.txt

View File

 @ -84,12 +84,13 @@ OPTIONS
 -3::
 --3way::
 	When the patch does not apply cleanly, fall back on 3-way merge if
 	the patch records the identity of blobs it is supposed to apply to,
 	and we have those blobs available locally, possibly leaving the
 	Attempt 3-way merge if the patch records the identity of blobs it is supposed
 	to apply to and we have those blobs available locally, possibly leaving the
 	conflict markers in the files in the working tree for the user to
 	resolve.  This option implies the `--index` option, and is incompatible
 	with the `--reject` and the `--cached` options.
 	resolve.  This option implies the `--index` option unless the
 	`--cached` option is used, and is incompatible with the `--reject` option.
 	When used with the `--cached` option, any conflicts are left at higher stages
 	in the cache.
 --build-fake-ancestor=<file>::
 	Newer 'git diff' output has embedded 'index information'

2

Documentation/git-blame.txt

View File

 @ -226,7 +226,7 @@ commit commentary), a blame viewer will not care.
 MAPPING AUTHORS
 ---------------
 include::mailmap.txt[]
 See linkgit:gitmailmap[5].
 SEE ALSO

6

Documentation/git-branch.txt

View File

 @ -78,8 +78,8 @@ renaming. If <newbranch> exists, -M must be used to force the rename
 to happen.
 The `-c` and `-C` options have the exact same semantics as `-m` and
 `-M`, except instead of the branch being renamed it along with its
 config and reflog will be copied to a new name.
 `-M`, except instead of the branch being renamed, it will be copied to a
 new name, along with its config and reflog.
 With a `-d` or `-D` option, `<branchname>` will be deleted.  You may
 specify more than one branch for deletion.  If the branch currently
 @ -153,7 +153,7 @@ OPTIONS
 --column[=<options>]::
 --no-column::
 	Display branch listing in columns. See configuration variable
 	column.branch for option syntax.`--column` and `--no-column`
 	`column.branch` for option syntax. `--column` and `--no-column`
 	without options are equivalent to 'always' and 'never' respectively.
 +
 This option is only applicable in non-verbose mode.

67

Documentation/git-cat-file.txt

View File

 @ -35,42 +35,42 @@ OPTIONS
 -t::
 	Instead of the content, show the object type identified by
 	<object>.
 	`<object>`.
 -s::
 	Instead of the content, show the object size identified by
 	<object>.
 	`<object>`.
 -e::
 	Exit with zero status if <object> exists and is a valid
 	object. If <object> is of an invalid format exit with non-zero and
 	Exit with zero status if `<object>` exists and is a valid
 	object. If `<object>` is of an invalid format exit with non-zero and
 	emits an error on stderr.
 -p::
 	Pretty-print the contents of <object> based on its type.
 	Pretty-print the contents of `<object>` based on its type.
 <type>::
 	Typically this matches the real type of <object> but asking
 	Typically this matches the real type of `<object>` but asking
 	for a type that can trivially be dereferenced from the given
 	<object> is also permitted.  An example is to ask for a
 	"tree" with <object> being a commit object that contains it,
 	or to ask for a "blob" with <object> being a tag object that
 	`<object>` is also permitted.  An example is to ask for a
 	"tree" with `<object>` being a commit object that contains it,
 	or to ask for a "blob" with `<object>` being a tag object that
 	points at it.
 --textconv::
 	Show the content as transformed by a textconv filter. In this case,
 	<object> has to be of the form <tree-ish>:<path>, or :<path> in
 	`<object>` has to be of the form `<tree-ish>:<path>`, or `:<path>` in
 	order to apply the filter to the content recorded in the index at
 	<path>.
 	`<path>`.
 --filters::
 	Show the content as converted by the filters configured in
 	the current working tree for the given <path> (i.e. smudge filters,
 	end-of-line conversion, etc). In this case, <object> has to be of
 	the form <tree-ish>:<path>, or :<path>.
 	the current working tree for the given `<path>` (i.e. smudge filters,
 	end-of-line conversion, etc). In this case, `<object>` has to be of
 	the form `<tree-ish>:<path>`, or `:<path>`.
 --path=<path>::
 	For use with --textconv or --filters, to allow specifying an object
 	For use with `--textconv` or `--filters`, to allow specifying an object
 	name and a path separately, e.g. when it is difficult to figure out
 	the revision from which the blob came.
 @ -115,15 +115,15 @@ OPTIONS
 	repository.
 --allow-unknown-type::
 	Allow -s or -t to query broken/corrupt objects of unknown type.
 	Allow `-s` or `-t` to query broken/corrupt objects of unknown type.
 --follow-symlinks::
 	With --batch or --batch-check, follow symlinks inside the
 	With `--batch` or `--batch-check`, follow symlinks inside the
 	repository when requesting objects with extended SHA-1
 	expressions of the form tree-ish:path-in-tree.  Instead of
 	providing output about the link itself, provide output about
 	the linked-to object.  If a symlink points outside the
 	tree-ish (e.g. a link to /foo or a root-level link to ../foo),
 	tree-ish (e.g. a link to `/foo` or a root-level link to `../foo`),
 	the portion of the link which is outside the tree will be
 	printed.
 +
 @ -175,15 +175,15 @@ respectively print:
 OUTPUT
 ------
 If `-t` is specified, one of the <type>.
 If `-t` is specified, one of the `<type>`.
 If `-s` is specified, the size of the <object> in bytes.
 If `-s` is specified, the size of the `<object>` in bytes.
 If `-e` is specified, no output, unless the <object> is malformed.
 If `-e` is specified, no output, unless the `<object>` is malformed.
 If `-p` is specified, the contents of <object> are pretty-printed.
 If `-p` is specified, the contents of `<object>` are pretty-printed.
 If <type> is specified, the raw (though uncompressed) contents of the <object>
 If `<type>` is specified, the raw (though uncompressed) contents of the `<object>`
 will be returned.
 BATCH OUTPUT
 @ -200,7 +200,7 @@ object, with placeholders of the form `%(atom)` expanded, followed by a
 newline. The available atoms are:
 `objectname`::
 	The 40-hex object name of the object.
 	The full hex representation of the object name.
 `objecttype`::
 	The type of the object (the same as `cat-file -t` reports).
 @ -215,8 +215,9 @@ newline. The available atoms are:
 `deltabase`::
 	If the object is stored as a delta on-disk, this expands to the
 -hex sha1 of the delta base object. Otherwise, expands to the
 	null sha1 (40 zeroes). See `CAVEATS` below.
 	full hex representation of the delta base object name.
 	Otherwise, expands to the null OID (all zeroes). See `CAVEATS`
 	below.
 `rest`::
 	If this atom is used in the output string, input lines are split
 @ -235,14 +236,14 @@ newline.
 For example, `--batch` without a custom format would produce:
 ------------
 <sha1> SP <type> SP <size> LF
 <oid> SP <type> SP <size> LF
 <contents> LF
 ------------
 Whereas `--batch-check='%(objectname) %(objecttype)'` would produce:
 ------------
 <sha1> SP <type> LF
 <oid> SP <type> LF
 ------------
 If a name is specified on stdin that cannot be resolved to an object in
 @ -258,7 +259,7 @@ If a name is specified that might refer to more than one object (an ambiguous sh
 <object> SP ambiguous LF
 ------------
 If --follow-symlinks is used, and a symlink in the repository points
 If `--follow-symlinks` is used, and a symlink in the repository points
 outside the repository, then `cat-file` will ignore any custom format
 and print:
 @ -267,11 +268,11 @@ symlink SP <size> LF
 <symlink> LF
 ------------
 The symlink will either be absolute (beginning with a /), or relative
 to the tree root.  For instance, if dir/link points to ../../foo, then
 <symlink> will be ../foo.  <size> is the size of the symlink in bytes.
 The symlink will either be absolute (beginning with a `/`), or relative
 to the tree root.  For instance, if dir/link points to `../../foo`, then
 `<symlink>` will be `../foo`.  `<size>` is the size of the symlink in bytes.
 If --follow-symlinks is used, the following error messages will be
 If `--follow-symlinks` is used, the following error messages will be
 displayed:
 ------------

9

Documentation/git-check-mailmap.txt

View File

 @ -36,10 +36,17 @@ name is provided or known to the 'mailmap', ``Name $$<user@host>$$'' is
 printed; otherwise only ``$$<user@host>$$'' is printed.
 CONFIGURATION
 -------------
 See `mailmap.file` and `mailmap.blob` in linkgit:git-config[1] for how
 to specify a custom `.mailmap` target file or object.
 MAPPING AUTHORS
 ---------------
 include::mailmap.txt[]
 See linkgit:gitmailmap[5].
 GIT

7

Documentation/git-clone.txt

View File

 @ -15,7 +15,7 @@ SYNOPSIS
 	  [--dissociate] [--separate-git-dir <git dir>]
 	  [--depth <depth>] [--[no-]single-branch] [--no-tags]
 	  [--recurse-submodules[=<pathspec>]] [--[no-]shallow-submodules]
 	  [--[no-]remote-submodules] [--jobs <n>] [--sparse]
 	  [--[no-]remote-submodules] [--jobs <n>] [--sparse] [--[no-]reject-shallow]
 	  [--filter=<filter>] [--] <repository>
 	  [<directory>]
 @ -149,6 +149,11 @@ objects from the source repository into a pack in the cloned repository.
 --no-checkout::
 	No checkout of HEAD is performed after the clone is complete.
 --[no-]reject-shallow::
 	Fail if the source repository is a shallow repository.
 	The 'clone.rejectShallow' configuration variable can be used to
 	specify the default.
 --bare::
 	Make a 'bare' Git repository.  That is, instead of
 	creating `<directory>` and placing the administrative

59

Documentation/git-commit.txt

View File

 @ -9,12 +9,13 @@ SYNOPSIS
 --------
 [verse]
 'git commit' [-a | --interactive | --patch] [-s] [-v] [-u<mode>] [--amend]
 	   [--dry-run] [(-c | -C | --fixup | --squash) <commit>]
 	   [--dry-run] [(-c | -C | --squash) <commit> | --fixup [(amend|reword):]<commit>)]
 	   [-F <file> | -m <msg>] [--reset-author] [--allow-empty]
 	   [--allow-empty-message] [--no-verify] [-e] [--author=<author>]
 	   [--date=<date>] [--cleanup=<mode>] [--[no-]status]
 	   [-i | -o] [--pathspec-from-file=<file> [--pathspec-file-nul]]
 	   [-S[<keyid>]] [--] [<pathspec>...]
 	   [(--trailer <token>[(=|:)<value>])...] [-S[<keyid>]]
 	   [--] [<pathspec>...]
 DESCRIPTION
 -----------
 @ -86,11 +87,44 @@ OPTIONS
 	Like '-C', but with `-c` the editor is invoked, so that
 	the user can further edit the commit message.
 --fixup=<commit>::
 	Construct a commit message for use with `rebase --autosquash`.
 	The commit message will be the subject line from the specified
 	commit with a prefix of "fixup! ".  See linkgit:git-rebase[1]
 	for details.
 --fixup=[(amend|reword):]<commit>::
 	Create a new commit which "fixes up" `<commit>` when applied with
 	`git rebase --autosquash`. Plain `--fixup=<commit>` creates a
 	"fixup!" commit which changes the content of `<commit>` but leaves
 	its log message untouched. `--fixup=amend:<commit>` is similar but
 	creates an "amend!" commit which also replaces the log message of
 	`<commit>` with the log message of the "amend!" commit.
 	`--fixup=reword:<commit>` creates an "amend!" commit which
 	replaces the log message of `<commit>` with its own log message
 	but makes no changes to the content of `<commit>`.
 +
 The commit created by plain `--fixup=<commit>` has a subject
 composed of "fixup!" followed by the subject line from <commit>,
 and is recognized specially by `git rebase --autosquash`. The `-m`
 option may be used to supplement the log message of the created
 commit, but the additional commentary will be thrown away once the
 "fixup!" commit is squashed into `<commit>` by
 `git rebase --autosquash`.
 +
 The commit created by `--fixup=amend:<commit>` is similar but its
 subject is instead prefixed with "amend!". The log message of
 <commit> is copied into the log message of the "amend!" commit and
 opened in an editor so it can be refined. When `git rebase
 --autosquash` squashes the "amend!" commit into `<commit>`, the
 log message of `<commit>` is replaced by the refined log message
 from the "amend!" commit. It is an error for the "amend!" commit's
 log message to be empty unless `--allow-empty-message` is
 specified.
 +
 `--fixup=reword:<commit>` is shorthand for `--fixup=amend:<commit>
 --only`. It creates an "amend!" commit with only a log message
 (ignoring any changes staged in the index). When squashed by `git
 rebase --autosquash`, it replaces the log message of `<commit>`
 without making any other changes.
 +
 Neither "fixup!" nor "amend!" commits change authorship of
 `<commit>` when applied by `git rebase --autosquash`.
 See linkgit:git-rebase[1] for details.
 --squash=<commit>::
 	Construct a commit message for use with `rebase --autosquash`.
 @ -166,6 +200,17 @@ The `-m` option is mutually exclusive with `-c`, `-C`, and `-F`.
 include::signoff-option.txt[]
 --trailer <token>[(=|:)<value>]::
 	Specify a (<token>, <value>) pair that should be applied as a
 	trailer. (e.g. `git commit --trailer "Signed-off-by:C O Mitter \
 	<committer@example.com>" --trailer "Helped-by:C O Mitter \
 	<committer@example.com>"` will add the "Signed-off-by" trailer
 	and the "Helped-by" trailer to the commit message.)
 	The `trailer.*` configuration variables
 	(linkgit:git-interpret-trailers[1]) can be used to define if
 	a duplicated trailer is omitted, where in the run of trailers
 	each trailer would appear, and other details.
 -n::
 --no-verify::
 	This option bypasses the pre-commit and commit-msg hooks.

21

Documentation/git-config.txt

View File

 @ -340,12 +340,33 @@ GIT_CONFIG::
 	Using the "--global" option forces this to ~/.gitconfig. Using the
 	"--system" option forces this to $(prefix)/etc/gitconfig.
 GIT_CONFIG_GLOBAL::
 GIT_CONFIG_SYSTEM::
 	Take the configuration from the given files instead from global or
 	system-level configuration. See linkgit:git[1] for details.
 GIT_CONFIG_NOSYSTEM::
 	Whether to skip reading settings from the system-wide
 	$(prefix)/etc/gitconfig file. See linkgit:git[1] for details.
 See also <<FILES>>.
 GIT_CONFIG_COUNT::
 GIT_CONFIG_KEY_<n>::
 GIT_CONFIG_VALUE_<n>::
 	If GIT_CONFIG_COUNT is set to a positive number, all environment pairs
 	GIT_CONFIG_KEY_<n> and GIT_CONFIG_VALUE_<n> up to that number will be
 	added to the process's runtime configuration. The config pairs are
 	zero-indexed. Any missing key or value is treated as an error. An empty
 	GIT_CONFIG_COUNT is treated the same as GIT_CONFIG_COUNT=0, namely no
 	pairs are processed. These environment variables will override values
 	in configuration files, but will be overridden by any explicit options
 	passed via `git -c`.
 +
 This is useful for cases where you want to spawn multiple git commands
 with a common configuration but cannot depend on a configuration file,
 for example when writing scripts.
 [[EXAMPLES]]
 EXAMPLES

4

Documentation/git-credential.txt

View File

 @ -159,3 +159,7 @@ empty string.
 +
 Components which are missing from the URL (e.g., there is no
 username in the example above) will be left unset.
 GIT
 ---
 Part of the linkgit:git[1] suite

24

Documentation/git-cvsserver.txt

View File

 @ -24,6 +24,18 @@ Usage:
 [verse]
 'git-cvsserver' [<options>] [pserver|server] [<directory> ...]
 DESCRIPTION
 -----------
 This application is a CVS emulation layer for Git.
 It is highly functional. However, not all methods are implemented,
 and for those methods that are implemented,
 not all switches are implemented.
 Testing has been done using both the CLI CVS client, and the Eclipse CVS
 plugin. Most functionality works fine with both of these clients.
 OPTIONS
 -------
 @ -57,18 +69,6 @@ access still needs to be enabled by the `gitcvs.enabled` config option
 unless `--export-all` was given, too.
 DESCRIPTION
 -----------
 This application is a CVS emulation layer for Git.
 It is highly functional. However, not all methods are implemented,
 and for those methods that are implemented,
 not all switches are implemented.
 Testing has been done using both the CLI CVS client, and the Eclipse CVS
 plugin. Most functionality works fine with both of these clients.
 LIMITATIONS
 -----------

8

Documentation/git-difftool.txt

View File

 @ -34,6 +34,14 @@ OPTIONS
 	This is the default behaviour; the option is provided to
 	override any configuration settings.
 --rotate-to=<file>::
 	Start showing the diff for the given path,
 	the paths before it will move to end and output.
 --skip-to=<file>::
 	Start showing the diff for the given path, skipping all
 	the paths before it.
 -t <tool>::
 --tool=<tool>::
 	Use the diff tool specified by <tool>.  Valid values include

8

Documentation/git-for-each-ref.txt

View File

 @ -260,11 +260,9 @@ contents:lines=N::
 	The first `N` lines of the message.
 Additionally, the trailers as interpreted by linkgit:git-interpret-trailers[1]
 are obtained as `trailers` (or by using the historical alias
 `contents:trailers`).  Non-trailer lines from the trailer block can be omitted
 with `trailers:only`. Whitespace-continuations can be removed from trailers so
 that each trailer appears on a line by itself with its full content with
 `trailers:unfold`. Both can be used together as `trailers:unfold,only`.
 are obtained as `trailers[:options]` (or by using the historical alias
 `contents:trailers[:options]`). For valid [:option] values see `trailers`
 section of linkgit:git-log[1].
 For sorting purposes, fields with numeric values sort in numeric order
 (`objectsize`, `authordate`, `committerdate`, `creatordate`, `taggerdate`).

34

Documentation/git-format-patch.txt

View File

 @ -36,11 +36,28 @@ SYNOPSIS
 DESCRIPTION
 -----------
 Prepare each commit with its patch in
 one file per commit, formatted to resemble UNIX mailbox format.
 Prepare each non-merge commit with its "patch" in
 one "message" per commit, formatted to resemble a UNIX mailbox.
 The output of this command is convenient for e-mail submission or
 for use with 'git am'.
 A "message" generated by the command consists of three parts:
 * A brief metadata header that begins with `From <commit>`
   with a fixed `Mon Sep 17 00:00:00 2001` datestamp to help programs
   like "file(1)" to recognize that the file is an output from this
   command, fields that record the author identity, the author date,
   and the title of the change (taken from the first paragraph of the
   commit log message).
 * The second and subsequent paragraphs of the commit log message.
 * The "patch", which is the "diff -p --stat" output (see
   linkgit:git-diff[1]) between the commit and its parent.
 The log message and the patch is separated by a line with a
 three-dash line.
 There are two ways to specify which commits to operate on.
 . A single commit, <since>, specifies that the commits leading
 @ -221,6 +238,11 @@ populated with placeholder text.
 	`--subject-prefix` option) has ` v<n>` appended to it.  E.g.
 	`--reroll-count=4` may produce `v4-0001-add-makefile.patch`
 	file that has "Subject: [PATCH v4 1/20] Add makefile" in it.
 	`<n>` does not have to be an integer (e.g. "--reroll-count=4.4",
 	or "--reroll-count=4rev2" are allowed), but the downside of
 	using such a reroll-count is that the range-diff/interdiff
 	with the previous version does not state exactly which
 	version the new interation is compared against.
 --to=<email>::
 	Add a `To:` header to the email headers. This is in addition
 @ -718,6 +740,14 @@ use it only when you know the recipient uses Git to apply your patch.
 $ git format-patch -3
 ------------
 CAVEATS
 -------
 Note that `format-patch` will omit merge commits from the output, even
 if they are part of the requested range. A simple "patch" does not
 include enough information for the receiving end to reproduce the same
 merge commit.
 SEE ALSO
 --------
 linkgit:git-am[1], linkgit:git-send-email[1]

14

Documentation/git-gc.txt

View File

 @ -117,12 +117,14 @@ NOTES
 'git gc' tries very hard not to delete objects that are referenced
 anywhere in your repository. In particular, it will keep not only
 objects referenced by your current set of branches and tags, but also
 objects referenced by the index, remote-tracking branches, notes saved
 by 'git notes' under refs/notes/, reflogs (which may reference commits
 in branches that were later amended or rewound), and anything else in
 the refs/* namespace.  If you are expecting some objects to be deleted
 and they aren't, check all of those locations and decide whether it
 makes sense in your case to remove those references.
 objects referenced by the index, remote-tracking branches, reflogs
 (which may reference commits in branches that were later amended or
 rewound), and anything else in the refs/* namespace. Note that a note
 (of the kind created by 'git notes') attached to an object does not
 contribute in keeping the object alive. If you are expecting some
 objects to be deleted and they aren't, check all of those locations
 and decide whether it makes sense in your case to remove those
 references.
 On the other hand, when 'git gc' runs concurrently with another process,
 there is a risk of it deleting an object that the other process is using

64

Documentation/git-grep.txt

View File

 @ -38,38 +38,6 @@ are lists of one or more search expressions separated by newline
 characters.  An empty string as search expression matches all lines.
 CONFIGURATION
 -------------
 grep.lineNumber::
 	If set to true, enable `-n` option by default.
 grep.column::
 	If set to true, enable the `--column` option by default.
 grep.patternType::
 	Set the default matching behavior. Using a value of 'basic', 'extended',
 	'fixed', or 'perl' will enable the `--basic-regexp`, `--extended-regexp`,
 	`--fixed-strings`, or `--perl-regexp` option accordingly, while the
 	value 'default' will return to the default matching behavior.
 grep.extendedRegexp::
 	If set to true, enable `--extended-regexp` option by default. This
 	option is ignored when the `grep.patternType` option is set to a value
 	other than 'default'.
 grep.threads::
 	Number of grep worker threads to use. If unset (or set to 0), Git will
 	use as many threads as the number of logical cores available.
 grep.fullName::
 	If set to true, enable `--full-name` option by default.
 grep.fallbackToNoIndex::
 	If set to true, fall back to git grep --no-index if git grep
 	is executed outside of a git repository.  Defaults to false.
 OPTIONS
 -------
 --cached::
 @ -363,6 +331,38 @@ with multiple threads might perform slower than single threaded if `--textconv`
 is given and there're too many text conversions. So if you experience low
 performance in this case, it might be desirable to use `--threads=1`.
 CONFIGURATION
 -------------
 grep.lineNumber::
 	If set to true, enable `-n` option by default.
 grep.column::
 	If set to true, enable the `--column` option by default.
 grep.patternType::
 	Set the default matching behavior. Using a value of 'basic', 'extended',
 	'fixed', or 'perl' will enable the `--basic-regexp`, `--extended-regexp`,
 	`--fixed-strings`, or `--perl-regexp` option accordingly, while the
 	value 'default' will return to the default matching behavior.
 grep.extendedRegexp::
 	If set to true, enable `--extended-regexp` option by default. This
 	option is ignored when the `grep.patternType` option is set to a value
 	other than 'default'.
 grep.threads::
 	Number of grep worker threads to use. If unset (or set to 0), Git will
 	use as many threads as the number of logical cores available.
 grep.fullName::
 	If set to true, enable `--full-name` option by default.
 grep.fallbackToNoIndex::
 	If set to true, fall back to git grep --no-index if git grep
 	is executed outside of a git repository.  Defaults to false.
 GIT
 ---
 Part of the linkgit:git[1] suite

10

Documentation/git-http-fetch.txt

View File

 @ -41,11 +41,17 @@ commit-id::
 		<commit-id>['\t'<filename-as-in--w>]
 --packfile=<hash>::
 	Instead of a commit id on the command line (which is not expected in
 	For internal use only. Instead of a commit id on the command
 	line (which is not expected in
 	this case), 'git http-fetch' fetches the packfile directly at the given
 	URL and uses index-pack to generate corresponding .idx and .keep files.
 	The hash is used to determine the name of the temporary file and is
 	arbitrary. The output of index-pack is printed to stdout.
 	arbitrary. The output of index-pack is printed to stdout. Requires
 	--index-pack-args.
 --index-pack-args=<args>::
 	For internal use only. The command to run on the contents of the
 	downloaded pack. Arguments are URL-encoded separated by spaces.
 --recover::
 	Verify that everything reachable from target is fetched.  Used after

25

Documentation/git-index-pack.txt

View File

 @ -9,17 +9,18 @@ git-index-pack - Build pack index file for an existing packed archive
 SYNOPSIS
 --------
 [verse]
 'git index-pack' [-v] [-o <index-file>] <pack-file>
 'git index-pack' [-v] [-o <index-file>] [--[no-]rev-index] <pack-file>
 'git index-pack' --stdin [--fix-thin] [--keep] [-v] [-o <index-file>]
                  [<pack-file>]
 		  [--[no-]rev-index] [<pack-file>]
 DESCRIPTION
 -----------
 Reads a packed archive (.pack) from the specified file, and
 builds a pack index file (.idx) for it.  The packed archive
 together with the pack index can then be placed in the
 objects/pack/ directory of a Git repository.
 builds a pack index file (.idx) for it. Optionally writes a
 reverse-index (.rev) for the specified pack. The packed
 archive together with the pack index can then be placed in
 the objects/pack/ directory of a Git repository.
 OPTIONS
 @ -35,6 +36,13 @@ OPTIONS
 	fails if the name of packed archive does not end
 	with .pack).
 --[no-]rev-index::
 	When this flag is provided, generate a reverse index
 	(a `.rev` file) corresponding to the given pack. If
 	`--verify` is given, ensure that the existing
 	reverse index is correct. Takes precedence over
 	`pack.writeReverseIndex`.
 --stdin::
 	When this flag is provided, the pack is read from stdin
 	instead and a copy is then written to <pack-file>. If
 @ -78,7 +86,12 @@ OPTIONS
 	Die if the pack contains broken links. For internal use only.
 --fsck-objects::
 	Die if the pack contains broken objects. For internal use only.
 	For internal use only.
 +
 Die if the pack contains broken objects. If the pack contains a tree
 pointing to a .gitmodules blob that does not exist, prints the hash of
 that blob (for the caller to check) after the hash that goes into the
 name of the pack/idx file (see "Notes").
 --threads=<n>::
 	Specifies the number of threads to spawn when resolving

94

Documentation/git-interpret-trailers.txt

View File

 @ -232,25 +232,38 @@ trailer.<token>.ifmissing::
 	that option for trailers with the specified <token>.
 trailer.<token>.command::
 	This option can be used to specify a shell command that will
 	be called to automatically add or modify a trailer with the
 	specified <token>.
 	This option behaves in the same way as 'trailer.<token>.cmd', except
 	that it doesn't pass anything as argument to the specified command.
 	Instead the first occurrence of substring $ARG is replaced by the
 	value that would be passed as argument.
 +
 When this option is specified, the behavior is as if a special
 '<token>=<value>' argument were added at the beginning of the command
 line, where <value> is taken to be the standard output of the
 specified command with any leading and trailing whitespace trimmed
 off.
 The 'trailer.<token>.command' option has been deprecated in favor of
 'trailer.<token>.cmd' due to the fact that $ARG in the user's command is
 only replaced once and that the original way of replacing $ARG is not safe.
 +
 If the command contains the `$ARG` string, this string will be
 replaced with the <value> part of an existing trailer with the same
 <token>, if any, before the command is launched.
 When both 'trailer.<token>.cmd' and 'trailer.<token>.command' are given
 for the same <token>, 'trailer.<token>.cmd' is used and
 'trailer.<token>.command' is ignored.
 trailer.<token>.cmd::
 	This option can be used to specify a shell command that will be called:
 	once to automatically add a trailer with the specified <token>, and then
 	each time a '--trailer <token>=<value>' argument to modify the <value> of
 	the trailer that this option would produce.
 +
 If some '<token>=<value>' arguments are also passed on the command
 line, when a 'trailer.<token>.command' is configured, the command will
 also be executed for each of these arguments. And the <value> part of
 these arguments, if any, will be used to replace the `$ARG` string in
 the command.
 When the specified command is first called to add a trailer
 with the specified <token>, the behavior is as if a special
 '--trailer <token>=<value>' argument was added at the beginning
 of the "git interpret-trailers" command, where <value>
 is taken to be the standard output of the command with any
 leading and trailing whitespace trimmed off.
 +
 If some '--trailer <token>=<value>' arguments are also passed
 on the command line, the command is called again once for each
 of these arguments with the same <token>. And the <value> part
 of these arguments, if any, will be passed to the command as its
 first argument. This way the command can produce a <value> computed
 from the <value> passed in the '--trailer <token>=<value>' argument.
 EXAMPLES
 --------
 @ -333,6 +346,55 @@ subject
 Fix #42
 ------------
 * Configure a 'help' trailer with a cmd use a script `glog-find-author`
   which search specified author identity from git log in git repository
   and show how it works:
 +
 ------------
 $ cat ~/bin/glog-find-author
 #!/bin/sh
 test -n "$1" && git log --author="$1" --pretty="%an <%ae>" -1 || true
 $ git config trailer.help.key "Helped-by: "
 $ git config trailer.help.ifExists "addIfDifferentNeighbor"
 $ git config trailer.help.cmd "~/bin/glog-find-author"
 $ git interpret-trailers --trailer="help:Junio" --trailer="help:Couder" <<EOF
 > subject
 >
 > message
 >
 > EOF
 subject
 message
 Helped-by: Junio C Hamano <gitster@pobox.com>
 Helped-by: Christian Couder <christian.couder@gmail.com>
 ------------
 * Configure a 'ref' trailer with a cmd use a script `glog-grep`
   to grep last relevant commit from git log in the git repository
   and show how it works:
 +
 ------------
 $ cat ~/bin/glog-grep
 #!/bin/sh
 test -n "$1" && git log --grep "$1" --pretty=reference -1 || true
 $ git config trailer.ref.key "Reference-to: "
 $ git config trailer.ref.ifExists "replace"
 $ git config trailer.ref.cmd "~/bin/glog-grep"
 $ git interpret-trailers --trailer="ref:Add copyright notices." <<EOF
 > subject
 >
 > message
 >
 > EOF
 subject
 message
 Reference-to: 8bc9a0c769 (Add copyright notices., 2005-04-07)
 ------------
 * Configure a 'see' trailer with a command to show the subject of a
   commit that is related, and show how it works:
 +

46

Documentation/git-log.txt

View File

 @ -107,47 +107,15 @@ DIFF FORMATTING
 By default, `git log` does not generate any diff output. The options
 below can be used to show the changes made by each commit.
 Note that unless one of `-c`, `--cc`, or `-m` is given, merge commits
 will never show a diff, even if a diff format like `--patch` is
 selected, nor will they match search options like `-S`. The exception is
 when `--first-parent` is in use, in which merges are treated like normal
 single-parent commits (this can be overridden by providing a
 combined-diff option or with `--no-diff-merges`).
 -c::
 	With this option, diff output for a merge commit
 	shows the differences from each of the parents to the merge result
 	simultaneously instead of showing pairwise diff between a parent
 	and the result one at a time. Furthermore, it lists only files
 	which were modified from all parents.
 --cc::
 	This flag implies the `-c` option and further compresses the
 	patch output by omitting uninteresting hunks whose contents in
 	the parents have only two variants and the merge result picks
 	one of them without modification.
 --combined-all-paths::
 	This flag causes combined diffs (used for merge commits) to
 	list the name of the file from all parents.  It thus only has
 	effect when -c or --cc are specified, and is likely only
 	useful if filename changes are detected (i.e. when either
 	rename or copy detection have been requested).
 -m::
 	This flag makes the merge commits show the full diff like
 	regular commits; for each merge parent, a separate log entry
 	and diff is generated. An exception is that only diff against
 	the first parent is shown when `--first-parent` option is given;
 	in that case, the output represents the changes the merge
 	brought _into_ the then-current branch.
 --diff-merges=off::
 --no-diff-merges::
 	Disable output of diffs for merge commits (default). Useful to
 	override `-m`, `-c`, or `--cc`.
 Note that unless one of `--diff-merges` variants (including short
 `-m`, `-c`, and `--cc` options) is explicitly given, merge commits
 will not show a diff, even if a diff format like `--patch` is
 selected, nor will they match search options like `-S`. The exception
 is when `--first-parent` is in use, in which case `first-parent` is
 the default format.
 :git-log: 1
 :diff-merges-default: `off`
 include::diff-options.txt[]
 include::diff-generate-patch.txt[]

8

Documentation/git-ls-files.txt

View File

 @ -13,6 +13,7 @@ SYNOPSIS
 		(--[cached|deleted|others|ignored|stage|unmerged|killed|modified])*
 		(-[c|d|o|i|s|u|k|m])*
 		[--eol]
 		[--deduplicate]
 		[-x <pattern>|--exclude=<pattern>]
 		[-X <file>|--exclude-from=<file>]
 		[--exclude-per-directory=<file>]
 @ -80,6 +81,13 @@ OPTIONS
 	\0 line termination on output and do not quote filenames.
 	See OUTPUT below for more information.
 --deduplicate::
 	When only filenames are shown, suppress duplicates that may
 	come from having multiple stages during a merge, or giving
 	`--deleted` and `--modified` option at the same time.
 	When any of the `-t`, `--unmerged`, or `--stage` option is
 	in use, this option has no effect.
 -x <pattern>::
 --exclude=<pattern>::
 	Skip untracked files matching pattern.

25

Documentation/git-mailinfo.txt

View File

 @ -9,7 +9,9 @@ git-mailinfo - Extracts patch and authorship from a single e-mail message
 SYNOPSIS
 --------
 [verse]
 'git mailinfo' [-k|-b] [-u | --encoding=<encoding> | -n] [--[no-]scissors] <msg> <patch>
 'git mailinfo' [-k|-b] [-u | --encoding=<encoding> | -n]
 	       [--[no-]scissors] [--quoted-cr=<action>]
 	       <msg> <patch>
 DESCRIPTION
 @ -53,7 +55,7 @@ character.
 	The commit log message, author name and author email are
 	taken from the e-mail, and after minimally decoding MIME
 	transfer encoding, re-coded in the charset specified by
 	i18n.commitencoding (defaulting to UTF-8) by transliterating
 	`i18n.commitEncoding` (defaulting to UTF-8) by transliterating
 	them.  This used to be optional but now it is the default.
 +
 Note that the patch is always used as-is without charset
 @ -61,7 +63,7 @@ conversion, even with this flag.
 --encoding=<encoding>::
 	Similar to -u.  But when re-coding, the charset specified here is
 	used instead of the one specified by i18n.commitencoding or UTF-8.
 	used instead of the one specified by `i18n.commitEncoding` or UTF-8.
 -n::
 	Disable all charset re-coding of the metadata.
 @ -89,6 +91,23 @@ This can be enabled by default with the configuration option mailinfo.scissors.
 --no-scissors::
 	Ignore scissors lines. Useful for overriding mailinfo.scissors settings.
 --quoted-cr=<action>::
 	Action when processes email messages sent with base64 or
 	quoted-printable encoding, and the decoded lines end with a CRLF
 	instead of a simple LF.
 +
 The valid actions are:
 +
 --
 *	`nowarn`: Git will do nothing when such a CRLF is found.
 *	`warn`: Git will issue a warning for each message if such a CRLF is
 	found.
 *	`strip`: Git will convert those CRLF to LF.
 --
 +
 The default action could be set by configuration option `mailinfo.quotedCR`.
 If no such configuration option has been set, `warn` will be used.
 <msg>::
 	The commit log message extracted from e-mail, usually
 	except the title line which comes from e-mail Subject.

128

Documentation/git-maintenance.txt

View File

 @ -92,10 +92,8 @@ commit-graph::
 prefetch::
 	The `prefetch` task updates the object directory with the latest
 	objects from all registered remotes. For each remote, a `git fetch`
 	command is run. The refmap is custom to avoid updating local or remote
 	branches (those in `refs/heads` or `refs/remotes`). Instead, the
 	remote refs are stored in `refs/prefetch/<remote>/`. Also, tags are
 	not updated.
 	command is run. The configured refspec is modified to place all
 	requested refs within `refs/prefetch/`. Also, tags are not updated.
 +
 This is done to avoid disrupting the remote-tracking branches. The end users
 expect these refs to stay unmoved unless they initiate a fetch.  With prefetch
 @ -145,6 +143,12 @@ incremental-repack::
 	which is a special case that attempts to repack all pack-files
 	into a single pack-file.
 pack-refs::
 	The `pack-refs` task collects the loose reference files and
 	collects them into a single file. This speeds up operations that
 	need to iterate across many references. See linkgit:git-pack-refs[1]
 	for more information.
 OPTIONS
 -------
 --auto::
 @ -218,6 +222,122 @@ Further, the `git gc` command should not be combined with
 but does not take the lock in the same way as `git maintenance run`. If
 possible, use `git maintenance run --task=gc` instead of `git gc`.
 The following sections describe the mechanisms put in place to run
 background maintenance by `git maintenance start` and how to customize
 them.
 BACKGROUND MAINTENANCE ON POSIX SYSTEMS
 ---------------------------------------
 The standard mechanism for scheduling background tasks on POSIX systems
 is cron(8). This tool executes commands based on a given schedule. The
 current list of user-scheduled tasks can be found by running `crontab -l`.
 The schedule written by `git maintenance start` is similar to this:
 -----------------------------------------------------------------------
 # BEGIN GIT MAINTENANCE SCHEDULE
 # The following schedule was created by Git
 # Any edits made in this region might be
 # replaced in the future by a Git command.
 1-23 * * * "/<path>/git" --exec-path="/<path>" for-each-repo --config=maintenance.repo maintenance run --schedule=hourly
 0 * * 1-6 "/<path>/git" --exec-path="/<path>" for-each-repo --config=maintenance.repo maintenance run --schedule=daily
 0 * * 0 "/<path>/git" --exec-path="/<path>" for-each-repo --config=maintenance.repo maintenance run --schedule=weekly
 # END GIT MAINTENANCE SCHEDULE
 -----------------------------------------------------------------------
 The comments are used as a region to mark the schedule as written by Git.
 Any modifications within this region will be completely deleted by
 `git maintenance stop` or overwritten by `git maintenance start`.
 The `crontab` entry specifies the full path of the `git` executable to
 ensure that the executed `git` command is the same one with which
 `git maintenance start` was issued independent of `PATH`. If the same user
 runs `git maintenance start` with multiple Git executables, then only the
 latest executable is used.
 These commands use `git for-each-repo --config=maintenance.repo` to run
 `git maintenance run --schedule=<frequency>` on each repository listed in
 the multi-valued `maintenance.repo` config option. These are typically
 loaded from the user-specific global config. The `git maintenance` process
 then determines which maintenance tasks are configured to run on each
 repository with each `<frequency>` using the `maintenance.<task>.schedule`
 config options. These values are loaded from the global or repository
 config values.
 If the config values are insufficient to achieve your desired background
 maintenance schedule, then you can create your own schedule. If you run
 `crontab -e`, then an editor will load with your user-specific `cron`
 schedule. In that editor, you can add your own schedule lines. You could
 start by adapting the default schedule listed earlier, or you could read
 the crontab(5) documentation for advanced scheduling techniques. Please
 do use the full path and `--exec-path` techniques from the default
 schedule to ensure you are executing the correct binaries in your
 schedule.
 BACKGROUND MAINTENANCE ON MACOS SYSTEMS
 ---------------------------------------
 While macOS technically supports `cron`, using `crontab -e` requires
 elevated privileges and the executed process does not have a full user
 context. Without a full user context, Git and its credential helpers
 cannot access stored credentials, so some maintenance tasks are not
 functional.
 Instead, `git maintenance start` interacts with the `launchctl` tool,
 which is the recommended way to schedule timed jobs in macOS. Scheduling
 maintenance through `git maintenance (start|stop)` requires some
 `launchctl` features available only in macOS 10.11 or later.
 Your user-specific scheduled tasks are stored as XML-formatted `.plist`
 files in `~/Library/LaunchAgents/`. You can see the currently-registered
 tasks using the following command:
 -----------------------------------------------------------------------
 $ ls ~/Library/LaunchAgents/org.git-scm.git*
 org.git-scm.git.daily.plist
 org.git-scm.git.hourly.plist
 org.git-scm.git.weekly.plist
 -----------------------------------------------------------------------
 One task is registered for each `--schedule=<frequency>` option. To
 inspect how the XML format describes each schedule, open one of these
 `.plist` files in an editor and inspect the `<array>` element following
 the `<key>StartCalendarInterval</key>` element.
 `git maintenance start` will overwrite these files and register the
 tasks again with `launchctl`, so any customizations should be done by
 creating your own `.plist` files with distinct names. Similarly, the
 `git maintenance stop` command will unregister the tasks with `launchctl`
 and delete the `.plist` files.
 To create more advanced customizations to your background tasks, see
 launchctl.plist(5) for more information.
 BACKGROUND MAINTENANCE ON WINDOWS SYSTEMS
 -----------------------------------------
 Windows does not support `cron` and instead has its own system for
 scheduling background tasks. The `git maintenance start` command uses
 the `schtasks` command to submit tasks to this system. You can inspect
 all background tasks using the Task Scheduler application. The tasks
 added by Git have names of the form `Git Maintenance (<frequency>)`.
 The Task Scheduler GUI has ways to inspect these tasks, but you can also
 export the tasks to XML files and view the details there.
 Note that since Git is a console application, these background tasks
 create a console window visible to the current user. This can be changed
 manually by selecting the "Run whether user is logged in or not" option
 in Task Scheduler. This change requires a password input, which is why
 `git maintenance start` does not select it by default.
 If you want to customize the background tasks, please rename the tasks
 so future calls to `git maintenance (start|stop)` do not overwrite your
 custom tasks.
 GIT
 ---

4

Documentation/git-mergetool--lib.txt

View File

 @ -38,6 +38,10 @@ get_merge_tool_cmd::
 get_merge_tool_path::
 	returns the custom path for a merge tool.
 initialize_merge_tool::
 	bring merge tool specific functions into scope so they can be used or
 	overridden.
 run_merge_tool::
 	launches a merge tool given the tool name and a true/false
 	flag to indicate whether a merge base is present.

4

Documentation/git-mergetool.txt

View File

 @ -99,6 +99,10 @@ success of the resolution after the custom tool has exited.
 	(see linkgit:git-config[1]).  To cancel `diff.orderFile`,
 	use `-O/dev/null`.
 CONFIGURATION
 -------------
 include::config/mergetool.txt[]
 TEMPORARY FILES
 ---------------
 `git mergetool` creates `*.orig` backup files while resolving merges.

39

Documentation/git-mktag.txt

View File

 @ -3,7 +3,7 @@ git-mktag(1)
 NAME
 ----
 git-mktag - Creates a tag object
 git-mktag - Creates a tag object with extra validation
 SYNOPSIS
 @ -13,23 +13,50 @@ SYNOPSIS
 DESCRIPTION
 -----------
 Reads a tag contents on standard input and creates a tag object
 that can also be used to sign other objects.
 The output is the new tag's <object> identifier.
 Reads a tag contents on standard input and creates a tag object. The
 output is the new tag's <object> identifier.
 This command is mostly equivalent to linkgit:git-hash-object[1]
 invoked with `-t tag -w --stdin`. I.e. both of these will create and
 write a tag found in `my-tag`:
     git mktag <my-tag
     git hash-object -t tag -w --stdin <my-tag
 The difference is that mktag will die before writing the tag if the
 tag doesn't pass a linkgit:git-fsck[1] check.
 The "fsck" check done mktag is stricter than what linkgit:git-fsck[1]
 would run by default in that all `fsck.<msg-id>` messages are promoted
 from warnings to errors (so e.g. a missing "tagger" line is an error).
 Extra headers in the object are also an error under mktag, but ignored
 by linkgit:git-fsck[1]. This extra check can be turned off by setting
 the appropriate `fsck.<msg-id>` varible:
     git -c fsck.extraHeaderEntry=ignore mktag <my-tag-with-headers
 OPTIONS
 -------
 --strict::
 	By default mktag turns on the equivalent of
 	linkgit:git-fsck[1] `--strict` mode. Use `--no-strict` to
 	disable it.
 Tag Format
 ----------
 A tag signature file, to be fed to this command's standard input,
 has a very simple fixed format: four lines of
   object <sha1>
   object <hash>
   type <typename>
   tag <tagname>
   tagger <tagger>
 followed by some 'optional' free-form message (some tags created
 by older Git may not have `tagger` line).  The message, when
 by older Git may not have `tagger` line).  The message, when it
 exists, is separated by a blank line from the header.  The
 message part may contain a signature that Git itself doesn't
 care about, but that can be verified with gpg.

14

Documentation/git-multi-pack-index.txt

View File

 @ -9,7 +9,8 @@ git-multi-pack-index - Write and verify multi-pack-indexes
 SYNOPSIS
 --------
 [verse]
 'git multi-pack-index' [--object-dir=<dir>] [--[no-]progress] <subcommand>
 'git multi-pack-index' [--object-dir=<dir>] [--[no-]progress]
 	[--preferred-pack=<pack>] <subcommand>
 DESCRIPTION
 -----------
 @ -30,7 +31,16 @@ OPTIONS
 The following subcommands are available:
 write::
 	Write a new MIDX file.
 	Write a new MIDX file. The following options are available for
 	the `write` sub-command:
 +
 --
 	--preferred-pack=<pack>::
 		Optionally specify the tie-breaking pack used when
 		multiple packs contain the same object. If not given,
 		ties are broken in favor of the pack with the lowest
 		mtime.
 --
 verify::
 	Verify the contents of the MIDX file.

4

Documentation/git-p4.txt

View File

 @ -762,3 +762,7 @@ IMPLEMENTATION DETAILS
   message indicating the p4 depot location and change number.  This
   line is used by later 'git p4 sync' operations to know which p4
   changes are new.
 GIT
 ---
 Part of the linkgit:git[1] suite

21

Documentation/git-pack-objects.txt

View File

 @ -85,6 +85,16 @@ base-name::
 	reference was included in the resulting packfile.  This
 	can be useful to send new tags to native Git clients.
 --stdin-packs::
 	Read the basenames of packfiles (e.g., `pack-1234abcd.pack`)
 	from the standard input, instead of object names or revision
 	arguments. The resulting pack contains all objects listed in the
 	included packs (those not beginning with `^`), excluding any
 	objects listed in the excluded packs (beginning with `^`).
 +
 Incompatible with `--revs`, or options that imply `--revs` (such as
 `--all`), with the exception of `--unpacked`, which is compatible.
 --window=<n>::
 --depth=<n>::
 	These two options affect how the objects contained in
 @ -400,6 +410,17 @@ Note that we pick a single island for each regex to go into, using "last
 one wins" ordering (which allows repo-specific config to take precedence
 over user-wide config, and so forth).
 CONFIGURATION
 -------------
 Various configuration variables affect packing, see
 linkgit:git-config[1] (search for "pack" and "delta").
 Notably, delta compression is not used on objects larger than the
 `core.bigFileThreshold` configuration variable and on files with the
 attribute `delta` set to false.
 SEE ALSO
 --------
 linkgit:git-rev-list[1]

2

Documentation/git-push.txt

View File

 @ -600,7 +600,7 @@ EXAMPLES
 `git push origin`::
 	Without additional configuration, pushes the current branch to
 	the configured upstream (`remote.origin.merge` configuration
 	the configured upstream (`branch.<name>.merge` configuration
 	variable) if it has the same name as the current branch, and
 	errors out without pushing otherwise.
 +

20

Documentation/git-range-diff.txt

View File

 @ -10,6 +10,7 @@ SYNOPSIS
 [verse]
 'git range-diff' [--color=[<when>]] [--no-color] [<diff-options>]
 	[--no-dual-color] [--creation-factor=<factor>]
 	[--left-only | --right-only]
 	( <range1> <range2> | <rev1>...<rev2> | <base> <rev1> <rev2> )
 DESCRIPTION
 @ -28,6 +29,17 @@ Finally, the list of matching commits is shown in the order of the
 second commit range, with unmatched commits being inserted just after
 all of their ancestors have been shown.
 There are three ways to specify the commit ranges:
 - `<range1> <range2>`: Either commit range can be of the form
   `<base>..<rev>`, `<rev>^!` or `<rev>^-<n>`. See `SPECIFYING RANGES`
   in linkgit:gitrevisions[7] for more details.
 - `<rev1>...<rev2>`. This is equivalent to
   `<rev2>..<rev1> <rev1>..<rev2>`.
 - `<base> <rev1> <rev2>`: This is equivalent to `<base>..<rev1>
   <base>..<rev2>`.
 OPTIONS
 -------
 @ -57,6 +69,14 @@ to revert to color all lines according to the outer diff markers
 	See the ``Algorithm`` section below for an explanation why this is
 	needed.
 --left-only::
 	Suppress commits that are missing from the first specified range
 	(or the "left range" when using the `<rev1>...<rev2>` format).
 --right-only::
 	Suppress commits that are missing from the second specified range
 	(or the "right range" when using the `<rev1>...<rev2>` format).
 --[no-]notes[=<ref>]::
 	This flag is passed to the `git log` program
 	(see linkgit:git-log[1]) that generates the patches.

55

Documentation/git-rebase.txt

View File

 @ -200,12 +200,6 @@ Alternatively, you can undo the 'git rebase' with
     git rebase --abort
 CONFIGURATION
 -------------
 include::config/rebase.txt[]
 include::config/sequencer.txt[]
 OPTIONS
 -------
 --onto <newbase>::
 @ -593,16 +587,17 @@ See also INCOMPATIBLE OPTIONS below.
 --autosquash::
 --no-autosquash::
 	When the commit log message begins with "squash! ..." (or
 	"fixup! ..."), and there is already a commit in the todo list that
 	matches the same `...`, automatically modify the todo list of rebase
 	-i so that the commit marked for squashing comes right after the
 	commit to be modified, and change the action of the moved commit
 	from `pick` to `squash` (or `fixup`).  A commit matches the `...` if
 	the commit subject matches, or if the `...` refers to the commit's
 	hash. As a fall-back, partial matches of the commit subject work,
 	too.  The recommended way to create fixup/squash commits is by using
 	the `--fixup`/`--squash` options of linkgit:git-commit[1].
 	When the commit log message begins with "squash! ..." or "fixup! ..."
 	or "amend! ...", and there is already a commit in the todo list that
 	matches the same `...`, automatically modify the todo list of
 	`rebase -i`, so that the commit marked for squashing comes right after
 	the commit to be modified, and change the action of the moved commit
 	from `pick` to `squash` or `fixup` or `fixup -C` respectively. A commit
 	matches the `...` if the commit subject matches, or if the `...` refers
 	to the commit's hash. As a fall-back, partial matches of the commit
 	subject work, too. The recommended way to create fixup/amend/squash
 	commits is by using the `--fixup`, `--fixup=amend:` or `--fixup=reword:`
 	and `--squash` options respectively of linkgit:git-commit[1].
 +
 If the `--autosquash` option is enabled by default using the
 configuration variable `rebase.autoSquash`, this option can be
 @ -622,6 +617,14 @@ See also INCOMPATIBLE OPTIONS below.
 --no-reschedule-failed-exec::
 	Automatically reschedule `exec` commands that failed. This only makes
 	sense in interactive mode (or when an `--exec` option was provided).
 +
 Even though this option applies once a rebase is started, it's set for
 the whole rebase at the start based on either the
 `rebase.rescheduleFailedExec` configuration (see linkgit:git-config[1]
 or "CONFIGURATION" below) or whether this option is
 provided. Otherwise an explicit `--no-reschedule-failed-exec` at the
 start would be overridden by the presence of
 `rebase.rescheduleFailedExec=true` configuration.
 INCOMPATIBLE OPTIONS
 --------------------
 @ -887,9 +890,17 @@ If you want to fold two or more commits into one, replace the command
 "pick" for the second and subsequent commits with "squash" or "fixup".
 If the commits had different authors, the folded commit will be
 attributed to the author of the first commit.  The suggested commit
 message for the folded commit is the concatenation of the commit
 messages of the first commit and of those with the "squash" command,
 but omits the commit messages of commits with the "fixup" command.
 message for the folded commit is the concatenation of the first
 commit's message with those identified by "squash" commands, omitting the
 messages of commits identified by "fixup" commands, unless "fixup -c"
 is used.  In that case the suggested commit message is only the message
 of the "fixup -c" commit, and an editor is opened allowing you to edit
 the message.  The contents (patch) of the "fixup -c" commit are still
 incorporated into the folded commit. If there is more than one "fixup -c"
 commit, the message from the final one is used.  You can also use
 "fixup -C" to get the same behavior as "fixup -c" except without opening
 an editor.
 'git rebase' will stop when "pick" has been replaced with "edit" or
 when a command fails due to merge errors. When you are done editing
 @ -1257,6 +1268,12 @@ merge tlsv1.3
 merge cmake
 ------------
 CONFIGURATION
 -------------
 include::config/rebase.txt[]
 include::config/sequencer.txt[]
 BUGS
 ----
 The todo list presented by the deprecated `--preserve-merges --interactive`

32

Documentation/git-repack.txt

View File

 @ -165,9 +165,35 @@ depth is 4095.
 	Pass the `--delta-islands` option to `git-pack-objects`, see
 	linkgit:git-pack-objects[1].
 Configuration
 -g=<factor>::
 --geometric=<factor>::
 	Arrange resulting pack structure so that each successive pack
 	contains at least `<factor>` times the number of objects as the
 	next-largest pack.
 +
 `git repack` ensures this by determining a "cut" of packfiles that need
 to be repacked into one in order to ensure a geometric progression. It
 picks the smallest set of packfiles such that as many of the larger
 packfiles (by count of objects contained in that pack) may be left
 intact.
 +
 Unlike other repack modes, the set of objects to pack is determined
 uniquely by the set of packs being "rolled-up"; in other words, the
 packs determined to need to be combined in order to restore a geometric
 progression.
 +
 When `--unpacked` is specified, loose objects are implicitly included in
 this "roll-up", without respect to their reachability. This is subject
 to change in the future. This option (implying a drastically different
 repack mode) is not guaranteed to work with all other combinations of
 option to `git repack`.
 CONFIGURATION
 -------------
 Various configuration variables affect packing, see
 linkgit:git-config[1] (search for "pack" and "delta").
 By default, the command passes `--delta-base-offset` option to
 'git pack-objects'; this typically results in slightly smaller packs,
 but the generated packs are incompatible with versions of Git older than
 @ -178,6 +204,10 @@ need to set the configuration variable `repack.UseDeltaBaseOffset` to
 is unaffected by this option as the conversion is performed on the fly
 as needed in that case.
 Delta compression is not used on objects larger than the
 `core.bigFileThreshold` configuration variable and on files with the
 attribute `delta` set to false.
 SEE ALSO
 --------
 linkgit:git-pack-objects[1]

93

Documentation/git-rev-list.txt

View File

 @ -31,6 +31,99 @@ include::rev-list-options.txt[]
 include::pretty-formats.txt[]
 EXAMPLES
 --------
 * Print the list of commits reachable from the current branch.
 +
 ----------
 git rev-list HEAD
 ----------
 * Print the list of commits on this branch, but not present in the
   upstream branch.
 +
 ----------
 git rev-list @{upstream}..HEAD
 ----------
 * Format commits with their author and commit message (see also the
   porcelain linkgit:git-log[1]).
 +
 ----------
 git rev-list --format=medium HEAD
 ----------
 * Format commits along with their diffs (see also the porcelain
   linkgit:git-log[1], which can do this in a single process).
 +
 ----------
 git rev-list HEAD |
 git diff-tree --stdin --format=medium -p
 ----------
 * Print the list of commits on the current branch that touched any
   file in the `Documentation` directory.
 +
 ----------
 git rev-list HEAD -- Documentation/
 ----------
 * Print the list of commits authored by you in the past year, on
   any branch, tag, or other ref.
 +
 ----------
 git rev-list --author=you@example.com --since=1.year.ago --all
 ----------
 * Print the list of objects reachable from the current branch (i.e., all
   commits and the blobs and trees they contain).
 +
 ----------
 git rev-list --objects HEAD
 ----------
 * Compare the disk size of all reachable objects, versus those
   reachable from reflogs, versus the total packed size. This can tell
   you whether running `git repack -ad` might reduce the repository size
   (by dropping unreachable objects), and whether expiring reflogs might
   help.
 +
 ----------
 # reachable objects
 git rev-list --disk-usage --objects --all
 # plus reflogs
 git rev-list --disk-usage --objects --all --reflog
 # total disk size used
 du -c .git/objects/pack/*.pack .git/objects/??/*
 # alternative to du: add up "size" and "size-pack" fields
 git count-objects -v
 ----------
 * Report the disk size of each branch, not including objects used by the
   current branch. This can find outliers that are contributing to a
   bloated repository size (e.g., because somebody accidentally committed
   large build artifacts).
 +
 ----------
 git for-each-ref --format='%(refname)' |
 while read branch
 do
 	size=$(git rev-list --disk-usage --objects HEAD..$branch)
 	echo "$size $branch"
 done |
 sort -n
 ----------
 * Compare the on-disk size of branches in one group of refs, excluding
   another. If you co-mingle objects from multiple remotes in a single
   repository, this can show which remotes are contributing to the
   repository size (taking the size of `origin` as a baseline).
 +
 ----------
 git rev-list --disk-usage --objects --remotes=$suspect --not --remotes=origin
 ----------
 GIT
 ---
 Part of the linkgit:git[1] suite

74

Documentation/git-rev-parse.txt

View File

 @ -212,6 +212,18 @@ Options for Files
 	Only the names of the variables are listed, not their value,
 	even if they are set.
 --path-format=(absolute|relative)::
 	Controls the behavior of certain other options. If specified as absolute, the
 	paths printed by those options will be absolute and canonical. If specified as
 	relative, the paths will be relative to the current working directory if that
 	is possible.  The default is option specific.
 +
 This option may be specified multiple times and affects only the arguments that
 follow it on the command line, either to the end of the command line or the next
 instance of this option.
 The following options are modified by `--path-format`:
 --git-dir::
 	Show `$GIT_DIR` if defined. Otherwise show the path to
 	the .git directory. The path shown, when relative, is
 @ -221,13 +233,42 @@ If `$GIT_DIR` is not defined and the current directory
 is not detected to lie in a Git repository or work tree
 print a message to stderr and exit with nonzero status.
 --git-common-dir::
 	Show `$GIT_COMMON_DIR` if defined, else `$GIT_DIR`.
 --resolve-git-dir <path>::
 	Check if <path> is a valid repository or a gitfile that
 	points at a valid repository, and print the location of the
 	repository.  If <path> is a gitfile then the resolved path
 	to the real repository is printed.
 --git-path <path>::
 	Resolve "$GIT_DIR/<path>" and takes other path relocation
 	variables such as $GIT_OBJECT_DIRECTORY,
 	$GIT_INDEX_FILE... into account. For example, if
 	$GIT_OBJECT_DIRECTORY is set to /foo/bar then "git rev-parse
 	--git-path objects/abc" returns /foo/bar/abc.
 --show-toplevel::
 	Show the (by default, absolute) path of the top-level directory
 	of the working tree. If there is no working tree, report an error.
 --show-superproject-working-tree::
 	Show the absolute path of the root of the superproject's
 	working tree (if exists) that uses the current repository as
 	its submodule.  Outputs nothing if the current repository is
 	not used as a submodule by any project.
 --shared-index-path::
 	Show the path to the shared index file in split index mode, or
 	empty if not in split-index mode.
 The following options are unaffected by `--path-format`:
 --absolute-git-dir::
 	Like `--git-dir`, but its output is always the canonicalized
 	absolute path.
 --git-common-dir::
 	Show `$GIT_COMMON_DIR` if defined, else `$GIT_DIR`.
 --is-inside-git-dir::
 	When the current working directory is below the repository
 	directory print "true", otherwise "false".
 @ -242,19 +283,6 @@ print a message to stderr and exit with nonzero status.
 --is-shallow-repository::
 	When the repository is shallow print "true", otherwise "false".
 --resolve-git-dir <path>::
 	Check if <path> is a valid repository or a gitfile that
 	points at a valid repository, and print the location of the
 	repository.  If <path> is a gitfile then the resolved path
 	to the real repository is printed.
 --git-path <path>::
 	Resolve "$GIT_DIR/<path>" and takes other path relocation
 	variables such as $GIT_OBJECT_DIRECTORY,
 	$GIT_INDEX_FILE... into account. For example, if
 	$GIT_OBJECT_DIRECTORY is set to /foo/bar then "git rev-parse
 	--git-path objects/abc" returns /foo/bar/abc.
 --show-cdup::
 	When the command is invoked from a subdirectory, show the
 	path of the top-level directory relative to the current
 @ -265,20 +293,6 @@ print a message to stderr and exit with nonzero status.
 	path of the current directory relative to the top-level
 	directory.
 --show-toplevel::
 	Show the absolute path of the top-level directory of the working
 	tree. If there is no working tree, report an error.
 --show-superproject-working-tree::
 	Show the absolute path of the root of the superproject's
 	working tree (if exists) that uses the current repository as
 	its submodule.  Outputs nothing if the current repository is
 	not used as a submodule by any project.
 --shared-index-path::
 	Show the path to the shared index file in split index mode, or
 	empty if not in split-index mode.
 --show-object-format[=(storage|input|output)]::
 	Show the object format (hash algorithm) used for the repository
 	for storage inside the `.git` directory, input, or output. For

4

Documentation/git-rm.txt

View File

 @ -23,7 +23,9 @@ branch, and no updates to their contents can be staged in the index,
 though that default behavior can be overridden with the `-f` option.
 When `--cached` is given, the staged content has to
 match either the tip of the branch or the file on disk,
 allowing the file to be removed from just the index.
 allowing the file to be removed from just the index. When
 sparse-checkouts are in use (see linkgit:git-sparse-checkout[1]),
 `git rm` will only remove paths within the sparse-checkout patterns.
 OPTIONS

8

Documentation/git-shortlog.txt

View File

 @ -111,11 +111,11 @@ include::rev-list-options.txt[]
 MAPPING AUTHORS
 ---------------
 The `.mailmap` feature is used to coalesce together commits by the same
 person in the shortlog, where their name and/or email address was
 spelled differently.
 See linkgit:gitmailmap[5].
 include::mailmap.txt[]
 Note that if `git shortlog` is run outside of a repository (to process
 log contents on standard input), it will look for a `.mailmap` file in
 the current directory.
 GIT
 ---

7

Documentation/git-show.txt

View File

 @ -45,10 +45,13 @@ include::pretty-options.txt[]
 include::pretty-formats.txt[]
 COMMON DIFF OPTIONS
 -------------------
 DIFF FORMATTING
 ---------------
 The options below can be used to change the way `git show` generates
 diff output.
 :git-log: 1
 :diff-merges-default: `dense-combined`
 include::diff-options.txt[]
 include::diff-generate-patch.txt[]

14

Documentation/git-sparse-checkout.txt

View File

 @ -45,6 +45,20 @@ To avoid interfering with other worktrees, it first enables the
 When `--cone` is provided, the `core.sparseCheckoutCone` setting is
 also set, allowing for better performance with a limited set of
 patterns (see 'CONE PATTERN SET' below).
 +
 Use the `--[no-]sparse-index` option to toggle the use of the sparse
 index format. This reduces the size of the index to be more closely
 aligned with your sparse-checkout definition. This can have significant
 performance advantages for commands such as `git status` or `git add`.
 This feature is still experimental. Some commands might be slower with
 a sparse index until they are properly integrated with the feature.
 +
 **WARNING:** Using a sparse index requires modifying the index in a way
 that is not completely understood by external tools. If you have trouble
 with this compatibility, then run `git sparse-checkout init --no-sparse-index`
 to rewrite your index to not be sparse. Older versions of Git will not
 understand the sparse directory entries index extension and may fail to
 interact with your repository until it is disabled.
 'set'::
 	Write a set of patterns to the sparse-checkout file, as given as

26

Documentation/git-stash.txt

View File

 @ -8,8 +8,8 @@ git-stash - Stash the changes in a dirty working directory away
 SYNOPSIS
 --------
 [verse]
 'git stash' list [<options>]
 'git stash' show [<options>] [<stash>]
 'git stash' list [<log-options>]
 'git stash' show [-u|--include-untracked|--only-untracked] [<diff-options>] [<stash>]
 'git stash' drop [-q|--quiet] [<stash>]
 'git stash' ( pop | apply ) [--index] [-q|--quiet] [<stash>]
 'git stash' branch <branchname> [<stash>]
 @ -67,7 +67,7 @@ save [-p|--patch] [-k|--[no-]keep-index] [-u|--include-untracked] [-a|--all] [-q
 	Instead, all non-option arguments are concatenated to form the stash
 	message.
 list [<options>]::
 list [<log-options>]::
 	List the stash entries that you currently have.  Each 'stash entry' is
 	listed with its name (e.g. `stash@{0}` is the latest entry, `stash@{1}` is
 @ -83,7 +83,7 @@ stash@{1}: On master: 9cc0589... Add git-stash
 The command takes options applicable to the 'git log'
 command to control what is shown and how. See linkgit:git-log[1].
 show [<options>] [<stash>]::
 show [-u|--include-untracked|--only-untracked] [<diff-options>] [<stash>]::
 	Show the changes recorded in the stash entry as a diff between the
 	stashed contents and the commit back when the stash entry was first
 @ -91,8 +91,8 @@ show [<options>] [<stash>]::
 	By default, the command shows the diffstat, but it will accept any
 	format known to 'git diff' (e.g., `git stash show -p stash@{1}`
 	to view the second most recent entry in patch form).
 	You can use stash.showStat and/or stash.showPatch config variables
 	to change the default behavior.
 	You can use stash.showIncludeUntracked, stash.showStat, and
 	stash.showPatch config variables to change the default behavior.
 pop [--index] [-q|--quiet] [<stash>]::
 @ -160,10 +160,18 @@ up with `git clean`.
 -u::
 --include-untracked::
 	This option is only valid for `push` and `save` commands.
 --no-include-untracked::
 	When used with the `push` and `save` commands,
 	all untracked files are also stashed and then cleaned up with
 	`git clean`.
 +
 All untracked files are also stashed and then cleaned up with
 `git clean`.
 When used with the `show` command, show the untracked files in the stash
 entry as part of the diff.
 --only-untracked::
 	This option is only valid for the `show` command.
 +
 Show only the untracked files in the stash entry as part of the diff.
 --index::
 	This option is only valid for `pop` and `apply` commands.

2

Documentation/git-status.txt

View File

 @ -130,7 +130,7 @@ ignored, then the directory is not shown, but all contents are shown.
 --column[=<options>]::
 --no-column::
 	Display untracked files in columns. See configuration variable
 	column.status for option syntax.`--column` and `--no-column`
 	`column.status` for option syntax. `--column` and `--no-column`
 	without options are equivalent to 'always' and 'never'
 	respectively.

38

Documentation/git-svn.txt

View File

 @ -1061,25 +1061,6 @@ with different name spaces.  For example:
 	branches = stable/*:refs/remotes/svn/stable/*
 	branches = debug/*:refs/remotes/svn/debug/*
 BUGS
 ----
 We ignore all SVN properties except svn:executable.  Any unhandled
 properties are logged to $GIT_DIR/svn/<refname>/unhandled.log
 Renamed and copied directories are not detected by Git and hence not
 tracked when committing to SVN.  I do not plan on adding support for
 this as it's quite difficult and time-consuming to get working for all
 the possible corner cases (Git doesn't do it, either).  Committing
 renamed and copied files is fully supported if they're similar enough
 for Git to detect them.
 In SVN, it is possible (though discouraged) to commit changes to a tag
 (because a tag is just a directory copy, thus technically the same as a
 branch). When cloning an SVN repository, 'git svn' cannot know if such a
 commit to a tag will happen in the future. Thus it acts conservatively
 and imports all SVN tags as branches, prefixing the tag name with 'tags/'.
 CONFIGURATION
 -------------
 @ -1166,6 +1147,25 @@ $GIT_DIR/svn/\**/.rev_map.*::
 if it is missing or not up to date.  'git svn reset' automatically
 rewinds it.
 BUGS
 ----
 We ignore all SVN properties except svn:executable.  Any unhandled
 properties are logged to $GIT_DIR/svn/<refname>/unhandled.log
 Renamed and copied directories are not detected by Git and hence not
 tracked when committing to SVN.  I do not plan on adding support for
 this as it's quite difficult and time-consuming to get working for all
 the possible corner cases (Git doesn't do it, either).  Committing
 renamed and copied files is fully supported if they're similar enough
 for Git to detect them.
 In SVN, it is possible (though discouraged) to commit changes to a tag
 (because a tag is just a directory copy, thus technically the same as a
 branch). When cloning an SVN repository, 'git svn' cannot know if such a
 commit to a tag will happen in the future. Thus it acts conservatively
 and imports all SVN tags as branches, prefixing the tag name with 'tags/'.
 SEE ALSO
 --------
 linkgit:git-rebase[1]

2

Documentation/git-tag.txt

View File

 @ -134,7 +134,7 @@ options for details.
 --column[=<options>]::
 --no-column::
 	Display tag listing in columns. See configuration variable
 	column.tag for option syntax.`--column` and `--no-column`
 	`column.tag` for option syntax. `--column` and `--no-column`
 	without options are equivalent to 'always' and 'never' respectively.
 +
 This option is only applicable when listing tags without annotation lines.

79

Documentation/git-worktree.txt

View File

 @ -97,8 +97,9 @@ list::
 List details of each working tree.  The main working tree is listed first,
 followed by each of the linked working trees.  The output details include
 whether the working tree is bare, the revision currently checked out, the
 branch currently checked out (or "detached HEAD" if none), and "locked" if
 the worktree is locked.
 branch currently checked out (or "detached HEAD" if none), "locked" if
 the worktree is locked, "prunable" if the worktree can be pruned by `prune`
 command.
 lock::
 @ -143,6 +144,11 @@ locate it. Running `repair` within the recently-moved working tree will
 reestablish the connection. If multiple linked working trees are moved,
 running `repair` from any working tree with each tree's new `<path>` as
 an argument, will reestablish the connection to all the specified paths.
 +
 If both the main working tree and linked working trees have been moved
 manually, then running `repair` in the main working tree and specifying the
 new `<path>` of each linked working tree will reestablish all connections
 in both directions.
 unlock::
 @ -226,9 +232,14 @@ This can also be set up as the default behaviour by using the
 -v::
 --verbose::
 	With `prune`, report all removals.
 +
 With `list`, output additional information about worktrees (see below).
 --expire <time>::
 	With `prune`, only expire unused working trees older than `<time>`.
 +
 With `list`, annotate missing working trees as prunable if they are
 older than `<time>`.
 --reason <string>::
 	With `lock`, an explanation why the working tree is locked.
 @ -367,13 +378,46 @@ $ git worktree list
 /path/to/other-linked-worktree  1234abc  (detached HEAD)
 ------------
 The command also shows annotations for each working tree, according to its state.
 These annotations are:
  * `locked`, if the working tree is locked.
  * `prunable`, if the working tree can be pruned via `git worktree prune`.
 ------------
 $ git worktree list
 /path/to/linked-worktree    abcd1234 [master]
 /path/to/locked-worktreee   acbd5678 (brancha) locked
 /path/to/prunable-worktree  5678abc  (detached HEAD) prunable
 ------------
 For these annotations, a reason might also be available and this can be
 seen using the verbose mode. The annotation is then moved to the next line
 indented followed by the additional information.
 ------------
 $ git worktree list --verbose
 /path/to/linked-worktree              abcd1234 [master]
 /path/to/locked-worktree-no-reason    abcd5678 (detached HEAD) locked
 /path/to/locked-worktree-with-reason  1234abcd (brancha)
 	locked: working tree path is mounted on a portable device
 /path/to/prunable-worktree            5678abc1 (detached HEAD)
 	prunable: gitdir file points to non-existent location
 ------------
 Note that the annotation is moved to the next line if the additional
 information is available, otherwise it stays on the same line as the
 working tree itself.
 Porcelain Format
 ~~~~~~~~~~~~~~~~
 The porcelain format has a line per attribute.  Attributes are listed with a
 label and value separated by a single space.  Boolean attributes (like `bare`
 and `detached`) are listed as a label only, and are present only
 if the value is true.  The first attribute of a working tree is always
 `worktree`, an empty line indicates the end of the record.  For example:
 if the value is true.  Some attributes (like `locked`) can be listed as a label
 only or with a value depending upon whether a reason is available.  The first
 attribute of a working tree is always `worktree`, an empty line indicates the
 end of the record.  For example:
 ------------
 $ git worktree list --porcelain
 @ -388,6 +432,33 @@ worktree /path/to/other-linked-worktree
 HEAD 1234abc1234abc1234abc1234abc1234abc1234a
 detached
 worktree /path/to/linked-worktree-locked-no-reason
 HEAD 5678abc5678abc5678abc5678abc5678abc5678c
 branch refs/heads/locked-no-reason
 locked
 worktree /path/to/linked-worktree-locked-with-reason
 HEAD 3456def3456def3456def3456def3456def3456b
 branch refs/heads/locked-with-reason
 locked reason why is locked
 worktree /path/to/linked-worktree-prunable
 HEAD 1233def1234def1234def1234def1234def1234b
 detached
 prunable gitdir file points to non-existent location
 ------------
 If the lock reason contains "unusual" characters such as newline, they
 are escaped and the entire reason is quoted as explained for the
 configuration variable `core.quotePath` (see linkgit:git-config[1]).
 For Example:
 ------------
 $ git worktree list --porcelain
 ...
 locked "reason\nwhy is locked"
 ...
 ------------
 EXAMPLES

34

Documentation/git.txt

View File

 @ -13,7 +13,7 @@ SYNOPSIS
     [--exec-path[=<path>]] [--html-path] [--man-path] [--info-path]
     [-p|--paginate|-P|--no-pager] [--no-replace-objects] [--bare]
     [--git-dir=<path>] [--work-tree=<path>] [--namespace=<name>]
     [--super-prefix=<path>]
     [--super-prefix=<path>] [--config-env=<name>=<envvar>]
     <command> [<args>]
 DESCRIPTION
 @ -80,6 +80,28 @@ config file). Including the equals but with an empty value (like `git -c
 foo.bar= ...`) sets `foo.bar` to the empty string which `git config
 --type=bool` will convert to `false`.
 --config-env=<name>=<envvar>::
 	Like `-c <name>=<value>`, give configuration variable
 	'<name>' a value, where <envvar> is the name of an
 	environment variable from which to retrieve the value. Unlike
 	`-c` there is no shortcut for directly setting the value to an
 	empty string, instead the environment variable itself must be
 	set to the empty string.  It is an error if the `<envvar>` does not exist
 	in the environment. `<envvar>` may not contain an equals sign
 	to avoid ambiguity with `<name>` containing one.
 +
 This is useful for cases where you want to pass transitory
 configuration options to git, but are doing so on OS's where
 other processes might be able to read your cmdline
 (e.g. `/proc/self/cmdline`), but not your environ
 (e.g. `/proc/self/environ`). That behavior is the default on
 Linux, but may not be on your system.
 +
 Note that this might add security for variables such as
 `http.extraHeader` where the sensitive information is part of
 the value, but not e.g. `url.<base>.insteadOf` where the
 sensitive information can be part of the key.
 --exec-path[=<path>]::
 	Path to wherever your core Git programs are installed.
 	This can also be controlled by setting the GIT_EXEC_PATH
 @ -648,6 +670,16 @@ for further details.
 	If this environment variable is set to `0`, git will not prompt
 	on the terminal (e.g., when asking for HTTP authentication).
 `GIT_CONFIG_GLOBAL`::
 `GIT_CONFIG_SYSTEM`::
 	Take the configuration from the given files instead from global or
 	system-level configuration files. If `GIT_CONFIG_SYSTEM` is set, the
 	system config file defined at build time (usually `/etc/gitconfig`)
 	will not be read. Likewise, if `GIT_CONFIG_GLOBAL` is set, neither
 	`$HOME/.gitconfig` nor `$XDG_CONFIG_HOME/git/config` will be read. Can
 	be set to `/dev/null` to skip reading configuration files of the
 	respective level.
 `GIT_CONFIG_NOSYSTEM`::
 	Whether to skip reading settings from the system-wide
 	`$(prefix)/etc/gitconfig` file.  This environment variable can

11

Documentation/gitattributes.txt

View File

 @ -845,6 +845,8 @@ patterns are available:
 - `rust` suitable for source code in the Rust language.
 - `scheme` suitable for source code in the Scheme language.
 - `tex` suitable for source code for LaTeX documents.
 @ -1174,7 +1176,8 @@ tag then no replacement will be done.  The placeholders are the same
 as those for the option `--pretty=format:` of linkgit:git-log[1],
 except that they need to be wrapped like this: `$Format:PLACEHOLDERS$`
 in the file.  E.g. the string `$Format:%H$` will be replaced by the
 commit hash.
 commit hash.  However, only one `%(describe)` placeholder is expanded
 per archive to avoid denial-of-service attacks.
 Packing objects
 @ -1244,6 +1247,12 @@ to:
 [attr]binary -diff -merge -text
 ------------
 NOTES
 -----
 Git does not follow symbolic links when accessing a `.gitattributes`
 file in the working tree. This keeps behavior consistent when the file
 is accessed from the index or a tree versus from the filesystem.
 EXAMPLES
 --------

41

Documentation/gitdiffcore.txt

View File

 @ -74,6 +74,7 @@ into another list.  There are currently 5 such transformations:
 - diffcore-merge-broken
 - diffcore-pickaxe
 - diffcore-order
 - diffcore-rotate
 These are applied in sequence.  The set of filepairs 'git diff-{asterisk}'
 commands find are used as the input to diffcore-break, and
 @ -168,6 +169,26 @@ a similarity score different from the default of 50% by giving a
 number after the "-M" or "-C" option (e.g. "-M8" to tell it to use
 /10 = 80%).
 Note that when rename detection is on but both copy and break
 detection are off, rename detection adds a preliminary step that first
 checks if files are moved across directories while keeping their
 filename the same.  If there is a file added to a directory whose
 contents is sufficiently similar to a file with the same name that got
 deleted from a different directory, it will mark them as renames and
 exclude them from the later quadratic step (the one that pairwise
 compares all unmatched files to find the "best" matches, determined by
 the highest content similarity).  So, for example, if a deleted
 docs/ext.txt and an added docs/config/ext.txt are similar enough, they
 will be marked as a rename and prevent an added docs/ext.md that may
 be even more similar to the deleted docs/ext.txt from being considered
 as the rename destination in the later step.  For this reason, the
 preliminary "match same filename" step uses a bit higher threshold to
 mark a file pair as a rename and stop considering other candidates for
 better matches.  At most, one comparison is done per file in this
 preliminary pass; so if there are several remaining ext.txt files
 throughout the directory hierarchy after exact rename detection, this
 preliminary step may be skipped for those files.
 Note.  When the "-C" option is used with `--find-copies-harder`
 option, 'git diff-{asterisk}' commands feed unmodified filepairs to
 diffcore mechanism as well as modified ones.  This lets the copy
 @ -276,6 +297,26 @@ Documentation
 t
 ------------------------------------------------
 diffcore-rotate: For Changing At Which Path Output Starts
 ---------------------------------------------------------
 This transformation takes one pathname, and rotates the set of
 filepairs so that the filepair for the given pathname comes first,
 optionally discarding the paths that come before it.  This is used
 to implement the `--skip-to` and the `--rotate-to` options.  It is
 an error when the specified pathname is not in the set of filepairs,
 but it is not useful to error out when used with "git log" family of
 commands, because it is unreasonable to expect that a given path
 would be modified by each and every commit shown by the "git log"
 command.  For this reason, when used with "git log", the filepair
 that sorts the same as, or the first one that sorts after, the given
 pathname is where the output starts.
 Use of this transformation combined with diffcore-order will produce
 unexpected results, as the input to this transformation is likely
 not sorted when diffcore-order is in effect.
 SEE ALSO
 --------
 linkgit:git-diff[1],

33

Documentation/githooks.txt

View File

 @ -138,7 +138,7 @@ given); `template` (if a `-t` option was given or the
 configuration option `commit.template` is set); `merge` (if the
 commit is a merge or a `.git/MERGE_MSG` file exists); `squash`
 (if a `.git/SQUASH_MSG` file exists); or `commit`, followed by
 a commit SHA-1 (if a `-c`, `-C` or `--amend` option was given).
 a commit object name (if a `-c`, `-C` or `--amend` option was given).
 If the exit status is non-zero, `git commit` will abort.
 @ -231,19 +231,19 @@ named remote is not being used both values will be the same.
 Information about what is to be pushed is provided on the hook's standard
 input with lines of the form:
   <local ref> SP <local sha1> SP <remote ref> SP <remote sha1> LF
   <local ref> SP <local object name> SP <remote ref> SP <remote object name> LF
 For instance, if the command +git push origin master:foreign+ were run the
 hook would receive a line like the following:
   refs/heads/master 67890 refs/heads/foreign 12345
 although the full, 40-character SHA-1s would be supplied.  If the foreign ref
 does not yet exist the `<remote SHA-1>` will be 40 `0`.  If a ref is to be
 deleted, the `<local ref>` will be supplied as `(delete)` and the `<local
 SHA-1>` will be 40 `0`.  If the local commit was specified by something other
 than a name which could be expanded (such as `HEAD~`, or a SHA-1) it will be
 supplied as it was originally given.
 although the full object name would be supplied.  If the foreign ref does not
 yet exist the `<remote object name>` will be the all-zeroes object name.  If a
 ref is to be deleted, the `<local ref>` will be supplied as `(delete)` and the
 `<local object name>` will be the all-zeroes object name.  If the local commit
 was specified by something other than a name which could be expanded (such as
 `HEAD~`, or an object name) it will be supplied as it was originally given.
 If this hook exits with a non-zero status, `git push` will abort without
 pushing anything.  Information about why the push is rejected may be sent
 @ -268,7 +268,7 @@ input a line of the format:
 where `<old-value>` is the old object name stored in the ref,
 `<new-value>` is the new object name to be stored in the ref and
 `<ref-name>` is the full name of the ref.
 When creating a new ref, `<old-value>` is 40 `0`.
 When creating a new ref, `<old-value>` is the all-zeroes object name.
 If the hook exits with non-zero status, none of the refs will be
 updated. If the hook exits with zero, updating of individual refs can
 @ -473,7 +473,8 @@ reference-transaction
 This hook is invoked by any Git command that performs reference
 updates. It executes whenever a reference transaction is prepared,
 committed or aborted and may thus get called multiple times.
 committed or aborted and may thus get called multiple times. The hook
 does not cover symbolic references (but that may change in the future).
 The hook takes exactly one argument, which is the current state the
 given reference transaction is in:
 @ -492,6 +493,14 @@ receives on standard input a line of the format:
   <old-value> SP <new-value> SP <ref-name> LF
 where `<old-value>` is the old object name passed into the reference
 transaction, `<new-value>` is the new object name to be stored in the
 ref and `<ref-name>` is the full name of the ref. When force updating
 the reference regardless of its current value or when the reference is
 to be created anew, `<old-value>` is the all-zeroes object name. To
 distinguish these cases, you can inspect the current value of
 `<ref-name>` via `git rev-parse`.
 The exit status of the hook is ignored for any state except for the
 "prepared" state. In the "prepared" state, a non-zero exit status will
 cause the transaction to be aborted. The hook will not be called with
 @ -550,7 +559,7 @@ command-dependent arguments may be passed in the future.
 The hook receives a list of the rewritten commits on stdin, in the
 format
   <old-sha1> SP <new-sha1> [ SP <extra-info> ] LF
   <old-object-name> SP <new-object-name> [ SP <extra-info> ] LF
 The 'extra-info' is again command-dependent.  If it is empty, the
 preceding SP is also omitted.  Currently, no commands pass any
 @ -566,7 +575,7 @@ rebase::
 	For the 'squash' and 'fixup' operation, all commits that were
 	squashed are listed as being rewritten to the squashed commit.
 	This means that there will be several lines sharing the same
 	'new-sha1'.
 	'new-object-name'.
 +
 The commits are guaranteed to be listed in the order that they were
 processed by rebase.

6

Documentation/gitignore.txt

View File

 @ -149,11 +149,15 @@ not tracked by Git remain untracked.
 To stop tracking a file that is currently tracked, use
 'git rm --cached'.
 Git does not follow symbolic links when accessing a `.gitignore` file in
 the working tree. This keeps behavior consistent when the file is
 accessed from the index or a tree versus from the filesystem.
 EXAMPLES
 --------
  - The pattern `hello.*` matches any file or folder
    whose name begins with `hello`. If one wants to restrict
    whose name begins with `hello.`. If one wants to restrict
    this only to the directory and not in its subdirectories,
    one can prepend the pattern with a slash, i.e. `/hello.*`;
    the pattern now matches `hello.txt`, `hello.c` but not

130

Documentation/gitmailmap.txt Normal file

View File

 @ -0,0 +1,130 @@
 gitmailmap(5)
 =============
 NAME
 ----
 gitmailmap - Map author/committer names and/or E-Mail addresses
 SYNOPSIS
 --------
 $GIT_WORK_TREE/.mailmap
 DESCRIPTION
 -----------
 If the file `.mailmap` exists at the toplevel of the repository, or at
 the location pointed to by the `mailmap.file` or `mailmap.blob`
 configuration options (see linkgit:git-config[1]), it
 is used to map author and committer names and email addresses to
 canonical real names and email addresses.
 SYNTAX
 ------
 The '#' character begins a comment to the end of line, blank lines
 are ignored.
 In the simple form, each line in the file consists of the canonical
 real name of an author, whitespace, and an email address used in the
 commit (enclosed by '<' and '>') to map to the name. For example:
 --
 	Proper Name <commit@email.xx>
 --
 The more complex forms are:
 --
 	<proper@email.xx> <commit@email.xx>
 --
 which allows mailmap to replace only the email part of a commit, and:
 --
 	Proper Name <proper@email.xx> <commit@email.xx>
 --
 which allows mailmap to replace both the name and the email of a
 commit matching the specified commit email address, and:
 --
 	Proper Name <proper@email.xx> Commit Name <commit@email.xx>
 --
 which allows mailmap to replace both the name and the email of a
 commit matching both the specified commit name and email address.
 Both E-Mails and names are matched case-insensitively. For example
 this would also match the 'Commit Name <commit&#64;email.xx>' above:
 --
 	Proper Name <proper@email.xx> CoMmIt NaMe <CoMmIt@EmAiL.xX>
 --
 NOTES
 -----
 Git does not follow symbolic links when accessing a `.mailmap` file in
 the working tree. This keeps behavior consistent when the file is
 accessed from the index or a tree versus from the filesystem.
 EXAMPLES
 --------
 Your history contains commits by two authors, Jane
 and Joe, whose names appear in the repository under several forms:
 ------------
 Joe Developer <joe@example.com>
 Joe R. Developer <joe@example.com>
 Jane Doe <jane@example.com>
 Jane Doe <jane@laptop.(none)>
 Jane D. <jane@desktop.(none)>
 ------------
 Now suppose that Joe wants his middle name initial used, and Jane
 prefers her family name fully spelled out. A `.mailmap` file to
 correct the names would look like:
 ------------
 Joe R. Developer <joe@example.com>
 Jane Doe <jane@example.com>
 Jane Doe <jane@desktop.(none)>
 ------------
 Note that there's no need to map the name for '<jane&#64;laptop.(none)>' to
 only correct the names. However, leaving the obviously broken
 '<jane&#64;laptop.(none)>' and '<jane&#64;desktop.(none)>' E-Mails as-is is
 usually not what you want. A `.mailmap` file which also corrects those
 is:
 ------------
 Joe R. Developer <joe@example.com>
 Jane Doe <jane@example.com> <jane@laptop.(none)>
 Jane Doe <jane@example.com> <jane@desktop.(none)>
 ------------
 Finally, let's say that Joe and Jane shared an E-Mail address, but not
 a name, e.g. by having these two commits in the history generated by a
 bug reporting system. I.e. names appearing in history as:
 ------------
 Joe <bugs@example.com>
 Jane <bugs@example.com>
 ------------
 A full `.mailmap` file which also handles those cases (an addition of
 two lines to the above example) would be:
 ------------
 Joe R. Developer <joe@example.com>
 Jane Doe <jane@example.com> <jane@laptop.(none)>
 Jane Doe <jane@example.com> <jane@desktop.(none)>
 Joe R. Developer <joe@example.com> Joe <bugs@example.com>
 Jane Doe <jane@example.com> Jane <bugs@example.com>
 ------------
 SEE ALSO
 --------
 linkgit:git-check-mailmap[1]
 GIT
 ---
 Part of the linkgit:git[1] suite

8

Documentation/gitmodules.txt

View File

 @ -98,6 +98,14 @@ submodule.<name>.shallow::
 	shallow clone (with a history depth of 1) unless the user explicitly
 	asks for a non-shallow clone.
 NOTES
 -----
 Git does not allow the `.gitmodules` file within a working tree to be a
 symbolic link, and will refuse to check out such a tree entry. This
 keeps behavior consistent when the file is accessed from the index or a
 tree versus from the filesystem, and helps Git reliably enforce security
 checks of the file contents.
 EXAMPLES
 --------

4

Documentation/gitnamespaces.txt

View File

 @ -62,3 +62,7 @@ git clone ext::'git --namespace=foo %s /tmp/prefixed.git'
 ----------
 include::transfer-data-leaks.txt[]
 GIT
 ---
 Part of the linkgit:git[1] suite

11

Documentation/gitweb.conf.txt

View File

 @ -751,6 +751,17 @@ default font sizes or lineheights are changed (e.g. via adding extra
 CSS stylesheet in `@stylesheets`), it may be appropriate to change
 these values.
 email-privacy::
 	Redact e-mail addresses from the generated HTML, etc. content.
 	This obscures e-mail addresses retrieved from the author/committer
 	and comment sections of the Git log.
 	It is meant to hinder web crawlers that harvest and abuse addresses.
 	Such crawlers may not respect robots.txt.
 	Note that users and user tools also see the addresses as redacted.
 	If Gitweb is not the final step in a workflow then subsequent steps
 	may misbehave because of the redacted information they receive.
 	Disabled by default.
 highlight::
 	Server-side syntax highlight support in "blob" view.  It requires
 	`$highlight_bin` program to be available (see the description of

131

Documentation/howto/coordinate-embargoed-releases.txt Normal file

View File

 @ -0,0 +1,131 @@
 Content-type: text/asciidoc
 Abstract: When a critical vulnerability is discovered and fixed, we follow this
  script to coordinate a public release.
 How we coordinate embargoed releases
 ====================================
 To protect Git users from critical vulnerabilities, we do not just release
 fixed versions like regular maintenance releases. Instead, we coordinate
 releases with packagers, keeping the fixes under an embargo until the release
 date. That way, users will have a chance to upgrade on that date, no matter
 what Operating System or distribution they run.
 Open a Security Advisory draft
 ------------------------------
 The first step is to https://github.com/git/git/security/advisories/new[open an
 advisory]. Technically, it is not necessary, but it is convenient and saves a
 bit of hassle. This advisory can also be used to obtain the CVE number and it
 will give us a private fork associated with it that can be used to collaborate
 on a fix.
 Release date of the embargoed version
 -------------------------------------
 If the vulnerability affects Windows users, we want to have our friends over at
 Visual Studio on board. This means we need to target a "Patch Tuesday" (i.e. a
 second Tuesday of the month), at the minimum three weeks from heads-up to
 coordinated release.
 If the vulnerability affects the server side, or can benefit from scans on the
 server side (i.e. if `git fsck` can detect an attack), it is important to give
 all involved Git repository hosting sites enough time to scan all of those
 repositories.
 Notifying the Linux distributions
 ---------------------------------
 At most two weeks before release date, we need to send a notification to
 distros@vs.openwall.org, preferably less than 7 days before the release date.
 This will reach most (all?) Linux distributions. See an example below, and the
 guidelines for this mailing list at
 https://oss-security.openwall.org/wiki/mailing-lists/distros#how-to-use-the-lists[here].
 Once the version has been published, we send a note about that to oss-security.
 As an example, see https://www.openwall.com/lists/oss-security/2019/12/13/1[the
 v2.24.1 mail];
 https://oss-security.openwall.org/wiki/mailing-lists/oss-security[Here] are
 their guidelines.
 The mail to oss-security should also describe the exploit, and give credit to
 the reporter(s): security researchers still receive too little respect for the
 invaluable service they provide, and public credit goes a long way to keep them
 paid by their respective organizations.
 Technically, describing any exploit can be delayed up to 7 days, but we usually
 refrain from doing that, including it right away.
 As a courtesy we typically attach a Git bundle (as `.tar.xz` because the list
 will drop `.bundle` attachments) in the mail to distros@ so that the involved
 parties can take care of integrating/backporting them. This bundle is typically
 created using a command like this:
 	git bundle create cve-xxx.bundle ^origin/master vA.B.C vD.E.F
 	tar cJvf cve-xxx.bundle.tar.xz cve-xxx.bundle
 Example mail to distros@vs.openwall.org
 ---------------------------------------
 ....
 To: distros@vs.openwall.org
 Cc: git-security@googlegroups.com, <other people involved in the report/fix>
 Subject: [vs] Upcoming Git security fix release
 Team,
 The Git project will release new versions on <date> at 10am Pacific Time or
 soon thereafter. I have attached a Git bundle (embedded in a `.tar.xz` to avoid
 it being dropped) which you can fetch into a clone of
 https://github.com/git/git via `git fetch --tags /path/to/cve-xxx.bundle`,
 containing the tags for versions <versions>.
 You can verify with `git tag -v <tag>` that the versions were signed by
 the Git maintainer, using the same GPG key as e.g. v2.24.0.
 Please use these tags to prepare `git` packages for your various
 distributions, using the appropriate tagged versions. The added test cases
 help verify the correctness.
 The addressed issues are:
 <list of CVEs with a short description, typically copy/pasted from Git's
 release notes, usually demo exploit(s), too>
 Credit for finding the vulnerability goes to <reporter>, credit for fixing
 it goes to <developer>.
 Thanks,
 <name>
 ....
 Example mail to oss-security@lists.openwall.com
 -----------------------------------------------
 ....
 To: oss-security@lists.openwall.com
 Cc: git-security@googlegroups.com, <other people involved in the report/fix>
 Subject: git: <copy from security advisory>
 Team,
 The Git project released new versions on <date>, addressing <CVE>.
 All supported platforms are affected in one way or another, and all Git
 versions all the way back to <version> are affected. The fixed versions are:
 <versions>.
 Link to the announcement: <link to lore.kernel.org/git>
 We highly recommend to upgrade.
 The addressed issues are:
 * <list of CVEs and their explanations, along with demo exploits>
 Credit for finding the vulnerability goes to <reporter>, credit for fixing
 it goes to <developer>.
 Thanks,
 <name>
 ....

2

Documentation/i18n.txt

View File

 @ -38,7 +38,7 @@ mind.
   a warning if the commit log message given to it does not look
   like a valid UTF-8 string, unless you explicitly say your
   project uses a legacy encoding.  The way to say this is to
   have i18n.commitencoding in `.git/config` file, like this:
   have `i18n.commitEncoding` in `.git/config` file, like this:
 +
 ------------
 [i18n]

110

Documentation/lint-gitlink.perl

View File

 @ -1,71 +1,67 @@
 #!/usr/bin/perl
 use File::Find;
 use Getopt::Long;
 use strict;
 use warnings;
 my $basedir = ".";
 GetOptions("basedir=s" => \$basedir)
 	or die("Cannot parse command line arguments\n");
 # Parse arguments, a simple state machine for input like:
 #
 # howto/*.txt config/*.txt --section=1 git.txt git-add.txt [...] --to-lint git-add.txt a-file.txt [...]
 my %TXT;
 my %SECTION;
 my $section;
 my $lint_these = 0;
 for my $arg (@ARGV) {
 	if (my ($sec) = $arg =~ /^--section=(\d+)$/s) {
 		$section = $sec;
 		next;
 	}
 my $found_errors = 0;
 	my ($name) = $arg =~ /^(.*?)\.txt$/s;
 	unless (defined $section) {
 		$TXT{$name} = $arg;
 		next;
 	}
 	$SECTION{$name} = $section;
 }
 my $exit_code = 0;
 sub report {
 	my ($where, $what, $error) = @_;
 	print "$where: $error: $what\n";
 	$found_errors = 1;
 	my ($pos, $line, $target, $msg) = @_;
 	substr($line, $pos) = "' <-- HERE";
 	$line =~ s/^\s+//;
 	print "$ARGV:$.: error: $target: $msg, shown with 'HERE' below:\n";
 	print "$ARGV:$.:\t'$line\n";
 	$exit_code = 1;
 }
 sub grab_section {
 	my ($page) = @_;
 	open my $fh, "<", "$basedir/$page.txt";
 	my $firstline = <$fh>;
 	chomp $firstline;
 	close $fh;
 	my ($section) = ($firstline =~ /.*\((\d)\)$/);
 	return $section;
 }
 @ARGV = sort values %TXT;
 die "BUG: Nothing to process!" unless @ARGV;
 while (<>) {
 	my $line = $_;
 	while ($line =~ m/linkgit:((.*?)\[(\d)\])/g) {
 		my $pos = pos $line;
 		my ($target, $page, $section) = ($1, $2, $3);
 sub lint {
 	my ($file) = @_;
 	open my $fh, "<", $file
 		or return;
 	while (<$fh>) {
 		my $where = "$file:$.";
 		while (s/linkgit:((.*?)\[(\d)\])//) {
 			my ($target, $page, $section) = ($1, $2, $3);
 		# De-AsciiDoc
 		$page =~ s/{litdd}/--/g;
 			# De-AsciiDoc
 			$page =~ s/{litdd}/--/g;
 			if ($page !~ /^git/) {
 				report($where, $target, "nongit link");
 				next;
 			}
 			if (! -f "$basedir/$page.txt") {
 				report($where, $target, "no such source");
 				next;
 			}
 			$real_section = grab_section($page);
 			if ($real_section != $section) {
 				report($where, $target,
 					"wrong section (should be $real_section)");
 				next;
 			}
 		if (!exists $TXT{$page}) {
 			report($pos, $line, $target, "link outside of our own docs");
 			next;
 		}
 		if (!exists $SECTION{$page}) {
 			report($pos, $line, $target, "link outside of our sectioned docs");
 			next;
 		}
 		my $real_section = $SECTION{$page};
 		if ($section != $SECTION{$page}) {
 			report($pos, $line, $target, "wrong section (should be $real_section)");
 			next;
 		}
 	}
 	close $fh;
 	# this resets our $. for each file
 	close ARGV if eof;
 }
 sub lint_it {
 	lint($File::Find::name) if -f && /\.txt$/;
 }
 if (!@ARGV) {
 	find({ wanted => \&lint_it, no_chdir => 1 }, $basedir);
 } else {
 	for (@ARGV) {
 		lint($_);
 	}
 }
 exit $found_errors;
 exit $exit_code;

Compare commits

1484 Commits v2.30.8 ... v2.32.0-rc

1 .gitattributes vendored Unescape Escape View File

7 .github/workflows/main.yml vendored Unescape Escape View File

2 .gitignore vendored Unescape Escape View File

1 .mailmap Unescape Escape View File

2 .travis.yml Unescape Escape View File

154 CODE_OF_CONDUCT.md Unescape Escape View File

12 Documentation/CodingGuidelines Unescape Escape View File

26 Documentation/Makefile Unescape Escape View File

2 Documentation/MyFirstContribution.txt Unescape Escape View File

24 Documentation/RelNotes/2.30.3.txt Unescape Escape View File

21 Documentation/RelNotes/2.30.4.txt Unescape Escape View File

12 Documentation/RelNotes/2.30.5.txt Unescape Escape View File

60 Documentation/RelNotes/2.30.6.txt Unescape Escape View File

86 Documentation/RelNotes/2.30.7.txt Unescape Escape View File

52 Documentation/RelNotes/2.30.8.txt Unescape Escape View File

365 Documentation/RelNotes/2.31.0.txt Normal file Unescape Escape View File

27 Documentation/RelNotes/2.31.1.txt Normal file Unescape Escape View File

397 Documentation/RelNotes/2.32.0.txt Normal file Unescape Escape View File

11 Documentation/SubmittingPatches Unescape Escape View File

2 Documentation/blame-options.txt Unescape Escape View File

6 Documentation/config.txt Unescape Escape View File

4 Documentation/config/advice.txt Unescape Escape View File

21 Documentation/config/checkout.txt Unescape Escape View File

4 Documentation/config/clone.txt Unescape Escape View File

6 Documentation/config/commitgraph.txt Unescape Escape View File

2 Documentation/config/core.txt Unescape Escape View File

2 Documentation/config/diff.txt Unescape Escape View File

5 Documentation/config/index.txt Unescape Escape View File

2 Documentation/config/init.txt Unescape Escape View File

5 Documentation/config/log.txt Unescape Escape View File

9 Documentation/config/lsrefs.txt Normal file Unescape Escape View File

5 Documentation/config/maintenance.txt Unescape Escape View File

15 Documentation/config/mergetool.txt Unescape Escape View File

22 Documentation/config/pack.txt Unescape Escape View File

6 Documentation/config/protocol.txt Unescape Escape View File

7 Documentation/config/push.txt Unescape Escape View File

10 Documentation/config/rebase.txt Unescape Escape View File

42 Documentation/config/safe.txt Unescape Escape View File

5 Documentation/config/stash.txt Unescape Escape View File

9 Documentation/config/uploadpack.txt Unescape Escape View File

11 Documentation/date-formats.txt Unescape Escape View File

13 Documentation/diff-generate-patch.txt Unescape Escape View File

71 Documentation/diff-options.txt Unescape Escape View File

9 Documentation/fetch-options.txt Unescape Escape View File

6 Documentation/git-am.txt Unescape Escape View File

11 Documentation/git-apply.txt Unescape Escape View File

2 Documentation/git-blame.txt Unescape Escape View File

6 Documentation/git-branch.txt Unescape Escape View File

67 Documentation/git-cat-file.txt Unescape Escape View File

9 Documentation/git-check-mailmap.txt Unescape Escape View File

7 Documentation/git-clone.txt Unescape Escape View File

59 Documentation/git-commit.txt Unescape Escape View File

21 Documentation/git-config.txt Unescape Escape View File

4 Documentation/git-credential.txt Unescape Escape View File

24 Documentation/git-cvsserver.txt Unescape Escape View File

8 Documentation/git-difftool.txt Unescape Escape View File

8 Documentation/git-for-each-ref.txt Unescape Escape View File

34 Documentation/git-format-patch.txt Unescape Escape View File

14 Documentation/git-gc.txt Unescape Escape View File

64 Documentation/git-grep.txt Unescape Escape View File

10 Documentation/git-http-fetch.txt Unescape Escape View File

25 Documentation/git-index-pack.txt Unescape Escape View File

94 Documentation/git-interpret-trailers.txt Unescape Escape View File

46 Documentation/git-log.txt Unescape Escape View File

8 Documentation/git-ls-files.txt Unescape Escape View File

25 Documentation/git-mailinfo.txt Unescape Escape View File

128 Documentation/git-maintenance.txt Unescape Escape View File

4 Documentation/git-mergetool--lib.txt Unescape Escape View File

4 Documentation/git-mergetool.txt Unescape Escape View File

39 Documentation/git-mktag.txt Unescape Escape View File

14 Documentation/git-multi-pack-index.txt Unescape Escape View File

4 Documentation/git-p4.txt Unescape Escape View File

21 Documentation/git-pack-objects.txt Unescape Escape View File

2 Documentation/git-push.txt Unescape Escape View File

20 Documentation/git-range-diff.txt Unescape Escape View File

55 Documentation/git-rebase.txt Unescape Escape View File

32 Documentation/git-repack.txt Unescape Escape View File

93 Documentation/git-rev-list.txt Unescape Escape View File

1484 Commits

v2.30.8 ... v2.32.0-rc

1

.gitattributes vendored

View File

7

.github/workflows/main.yml vendored

View File

2

.gitignore vendored

View File

1

.mailmap

View File

2

.travis.yml

View File

154

CODE_OF_CONDUCT.md

View File

12

Documentation/CodingGuidelines

View File

26

Documentation/Makefile

View File

2

Documentation/MyFirstContribution.txt

View File

24

Documentation/RelNotes/2.30.3.txt

View File

21

Documentation/RelNotes/2.30.4.txt

View File

12

Documentation/RelNotes/2.30.5.txt

View File

60

Documentation/RelNotes/2.30.6.txt

View File

86

Documentation/RelNotes/2.30.7.txt

View File

52

Documentation/RelNotes/2.30.8.txt

View File

365

Documentation/RelNotes/2.31.0.txt Normal file

View File

27

Documentation/RelNotes/2.31.1.txt Normal file

View File

397

Documentation/RelNotes/2.32.0.txt Normal file

View File

11

Documentation/SubmittingPatches

View File

2

Documentation/blame-options.txt

View File

6

Documentation/config.txt

View File

4

Documentation/config/advice.txt

View File

21

Documentation/config/checkout.txt

View File

4

Documentation/config/clone.txt

View File

6

Documentation/config/commitgraph.txt

View File

2

Documentation/config/core.txt

View File

2

Documentation/config/diff.txt

View File

5

Documentation/config/index.txt

View File

2

Documentation/config/init.txt

View File

5

Documentation/config/log.txt

View File

9

Documentation/config/lsrefs.txt Normal file

View File

5

Documentation/config/maintenance.txt

View File

15

Documentation/config/mergetool.txt

View File

22

Documentation/config/pack.txt

View File

6

Documentation/config/protocol.txt

View File

7

Documentation/config/push.txt

View File

10

Documentation/config/rebase.txt

View File

42

Documentation/config/safe.txt

View File

5

Documentation/config/stash.txt

View File

9

Documentation/config/uploadpack.txt

View File

11

Documentation/date-formats.txt

View File

13

Documentation/diff-generate-patch.txt

View File

71

Documentation/diff-options.txt

View File

9

Documentation/fetch-options.txt

View File

6

Documentation/git-am.txt

View File

11

Documentation/git-apply.txt

View File

2

Documentation/git-blame.txt

View File

6

Documentation/git-branch.txt

View File

67

Documentation/git-cat-file.txt

View File

9

Documentation/git-check-mailmap.txt

View File

7

Documentation/git-clone.txt

View File

59

Documentation/git-commit.txt

View File

21

Documentation/git-config.txt

View File

4

Documentation/git-credential.txt

View File

24

Documentation/git-cvsserver.txt

View File

8

Documentation/git-difftool.txt

View File

8

Documentation/git-for-each-ref.txt

View File

34

Documentation/git-format-patch.txt

View File

14

Documentation/git-gc.txt

View File

64

Documentation/git-grep.txt

View File

10

Documentation/git-http-fetch.txt

View File

25

Documentation/git-index-pack.txt

View File

94

Documentation/git-interpret-trailers.txt

View File

46

Documentation/git-log.txt

View File

8

Documentation/git-ls-files.txt

View File

25

Documentation/git-mailinfo.txt

View File

128

Documentation/git-maintenance.txt

View File

4

Documentation/git-mergetool--lib.txt

View File

4

Documentation/git-mergetool.txt

View File

39

Documentation/git-mktag.txt

View File

14

Documentation/git-multi-pack-index.txt

View File

4

Documentation/git-p4.txt

View File

21

Documentation/git-pack-objects.txt

View File

2

Documentation/git-push.txt

View File

20

Documentation/git-range-diff.txt

View File

55

Documentation/git-rebase.txt

View File

32

Documentation/git-repack.txt

View File

93

Documentation/git-rev-list.txt

View File

74

Documentation/git-rev-parse.txt

View File