Compare commits

..

3416 Commits

Author SHA1 Message Date
e9d7761bb9 Git 2.34.1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-24 10:55:13 -08:00
d62a7656fe Merge branch 'jc/save-restore-terminal-revert' into maint
Regression fix for 2.34

* jc/save-restore-terminal-revert:
  Revert "editor: save and reset terminal after calling EDITOR"
2021-11-23 14:48:15 -08:00
eef0a8e7c1 Merge branch 'ds/add-rm-with-sparse-index' into maint
Regression fix for 2.34

* ds/add-rm-with-sparse-index:
  dir: revert "dir: select directories correctly"
2021-11-23 14:48:11 -08:00
bcef4ba329 Merge branch 'ab/update-submitting-patches' into maint
Doc fix.

* ab/update-submitting-patches:
  SubmittingPatches: fix Asciidoc syntax in "GitHub CI" section
2021-11-23 14:48:08 -08:00
ad03180c5c Merge branch 'ev/pull-already-up-to-date-is-noop' into maint
"git pull" with any strategy when the other side is behind us
should succeed as it is a no-op, but doesn't.

* ev/pull-already-up-to-date-is-noop:
  pull: should be noop when already-up-to-date
2021-11-23 14:48:04 -08:00
a650ff5aec Merge branch 'hm/paint-hits-in-log-grep' into maint
"git grep" looking in a blob that has non-UTF8 payload was
completely broken when linked with versions of PCREv2 library older
than 10.34 in the latest release.

* hm/paint-hits-in-log-grep:
  Revert "grep/pcre2: fix an edge case concerning ascii patterns and UTF-8 data"
2021-11-23 14:48:00 -08:00
e3f7e01b50 Revert "editor: save and reset terminal after calling EDITOR"
This reverts commit 3d411afabc,
blindly opening /dev/tty and calling tcsetattr() seems to be causing
problems.

cf. https://bugs.eclipse.org/bugs/show_bug.cgi?id=577358
cf. https://lore.kernel.org/git/04ab7301-ea34-476c-eae4-4044fef74b91@gmail.com/

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-22 15:04:20 -08:00
33c5d6c845 dir: revert "dir: select directories correctly"
This reverts commit f6526728f9.

The change in f652672 (dir: select directories correctly, 2021-09-24)
caused a regression in directory-based matches with non-cone-mode
patterns, especially for .gitignore patterns. A test is included to
prevent this regression in the future.

The commit ed495847 (dir: fix pattern matching on dirs, 2021-09-24) was
reverted in 5ceb663 (dir: fix directory-matching bug, 2021-11-02) for
similar reasons. Neither commit changed tests, and tests added later in
the series continue to pass when these commits are reverted.

Reported-by: Danial Alihosseini <danial.alihosseini@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-22 14:53:23 -08:00
e7f3925bed Revert "grep/pcre2: fix an edge case concerning ascii patterns and UTF-8 data"
This reverts commit ae39ba431a, as it
breaks "grep" when looking for a string in non UTF-8 haystack, when
linked with certain versions of PCREv2 library.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-19 09:10:27 -08:00
ea1954af77 pull: should be noop when already-up-to-date
The already-up-to-date pull bug was fixed for --ff-only but it did not
include the case where --ff or --ff-only are not specified. This updates
the --ff-only fix to include the case where --ff or --ff-only are not
specified in command line flags or config.

Signed-off-by: Erwin Villejo <erwin.villejo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-18 14:38:53 -08:00
cd3e606211 Git 2.34
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-14 22:50:52 -08:00
a288957a40 Merge tag 'l10n-2.34.0-rnd3.1' of git://github.com/git-l10n/git-po
l10n-2.34.0-rnd3.1

* tag 'l10n-2.34.0-rnd3.1' of git://github.com/git-l10n/git-po: (38 commits)
  l10n: pl: 2.34.0 round 3
  l10n: it: fix typos found by git-po-helper
  l10n: ko: fix typos found by git-po-helper
  l10n: Update Catalan translation
  l10n: po-id for 2.34 (round 3)
  l10n: bg.po: Updated Bulgarian translation (5211t)
  l10n: de.po: Update German translation for Git v2.34.0
  l10n: sv.po: Update Swedish translation (5211t0f0)
  l10n: vi(5211t): Translation for v2.34.0 rd3
  l10n: zh_TW.po: v2.34.0 round 3 (0 untranslated)
  l10n: fr: v2.34.0 rnd 3
  l10n: tr: v2.34.0 round 3
  l10n: zh_CN: v2.34.0 round 3
  l10n: git.pot: v2.34.0 round 3 (1 new)
  l10n: pl: 2.34.0 round 2
  l10n: vi(5210t): Translation for v2.34.0 rd2
  l10n: es: 2.34.0 round 2
  l10n: Update Catalan translation
  l10n: bg.po: Updated Bulgarian translation (5210t)
  l10n: fr: v2.34.0 round 2
  ...
2021-11-14 21:45:40 -08:00
cae3877e72 l10n: pl: 2.34.0 round 3
Signed-off-by: Arusekk <arek_koz@o2.pl>
2021-11-14 15:19:23 +01:00
5a0724ad3e l10n: it: fix typos found by git-po-helper
Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
2021-11-14 19:40:41 +08:00
edbd9f3715 SubmittingPatches: fix Asciidoc syntax in "GitHub CI" section
A superfluous ']' was added to the title of the GitHub CI section in
f003a91f5c (SubmittingPatches: replace discussion of Travis with GitHub
Actions, 2021-07-22). Remove it.

While at it, format the URL for a GitHub user's workflow runs of Git
between backticks, since if not Asciidoc formats only the first part,
"https://github.com/<Your", as a link, which is not very useful.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-13 23:41:54 -08:00
4e13fcc213 l10n: ko: fix typos found by git-po-helper
When checking typos in file "po/ko.po", "git-po-helper" reports lots of
false positives because there are no spaces between ASCII and Korean
characters. After applied commit adee197 "(dict: add smudge table for
Korean language, 2021-11-11)" of "git-l10n/git-po-helper" to suppress
these false positives, some easy-to-fix typos are found and fixed.

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
2021-11-14 10:01:38 +08:00
1293313003 l10n: Update Catalan translation
Signed-off-by: Jordi Mas <jmas@softcatala.org>
2021-11-13 16:35:53 +01:00
1371836157 Merge branch 'po-id' of github.com:bagasme/git-po
* 'po-id' of github.com:bagasme/git-po:
  l10n: po-id for 2.34 (round 3)
2021-11-13 14:42:30 +08:00
d1dad7d765 l10n: po-id for 2.34 (round 3)
- Translate following new components:
    * merge.c
    * rebase-interactive.c
    * rebase.c
    * midx.c
  - Clean up obsolete translations

Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
2021-11-13 12:28:29 +07:00
95d85f4884 Merge branch 'master' of github.com:ruester/git-po-de
* 'master' of github.com:ruester/git-po-de:
  l10n: de.po: Update German translation for Git v2.34.0
2021-11-13 09:27:58 +08:00
5a73c6bdc7 Merge branch 'js/trace2-raise-format-version'
When we added a new event type to trace2 event stream, we forgot to
raise the format version number, which has been corrected.

* js/trace2-raise-format-version:
  trace2: increment event format version
2021-11-12 15:29:25 -08:00
2c0fa66bc8 Merge branch 'ab/fsck-unexpected-type'
Regression fix.

* ab/fsck-unexpected-type:
  object-file: free(*contents) only in read_loose_object() caller
  object-file: fix SEGV on free() regression in v2.34.0-rc2
2021-11-12 15:29:25 -08:00
8996d68ac7 Merge branch 'ps/connectivity-optim'
Regression fix.

* ps/connectivity-optim:
  Revert "connected: do not sort input revisions"
2021-11-12 15:29:24 -08:00
5f93836143 l10n: bg.po: Updated Bulgarian translation (5211t)
Signed-off-by: Alexander Shopov <ash@kambanaria.org>
2021-11-12 19:24:16 +01:00
db92cdb5c8 l10n: de.po: Update German translation for Git v2.34.0
Signed-off-by: Matthias Rüster <matthias.ruester@gmail.com>
Reviewed-by: Ralf Thielow <ralf.thielow@gmail.com>
Reviewed-by: Phillip Szelat <phillip.szelat@gmail.com>
2021-11-12 17:01:25 +01:00
04480e67fe trace2: increment event format version
In 64bc752 (trace2: add trace2_child_ready() to report on background
children, 2021-09-20), we added a new "child_ready" event. In
Documentation/technical/api-trace2.txt, we promise that adding a new
event type will result in incrementing the trace2 event format version
number, but this was not done. Correct this in code & docs.

Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-11 15:01:04 -08:00
f29b823c3a l10n: sv.po: Update Swedish translation (5211t0f0)
Signed-off-by: Peter Krefting <peter@softwolves.pp.se>
2021-11-11 23:22:48 +01:00
16235e3b14 object-file: free(*contents) only in read_loose_object() caller
In the preceding commit a free() of uninitialized memory regression in
96e41f58fe (fsck: report invalid object type-path combinations,
2021-10-01) was fixed, but we'd still have an issue with leaking
memory from fsck_loose(). Let's fix that issue too.

That issue was introduced in my 31deb28f5e (fsck: don't hard die on
invalid object types, 2021-10-01). It can be reproduced under
SANITIZE=leak with the test I added in 093fffdfbe (fsck tests: add
test for fsck-ing an unknown type, 2021-10-01):

    ./t1450-fsck.sh --run=84 -vixd

In some sense it's not a problem, we lost the same amount of memory in
terms of things malloc'd and not free'd. It just moved from the "still
reachable" to "definitely lost" column in valgrind(1) nomenclature[1],
since we'd have die()'d before.

But now that we don't hard die() anymore in the library let's properly
free() it. Doing so makes this code much easier to follow, since we'll
now have one function owning the freeing of the "contents" variable,
not two.

For context on that memory management pattern the read_loose_object()
function was added in f6371f9210 (sha1_file: add read_loose_object()
function, 2017-01-13) and subsequently used in c68b489e56 (fsck:
parse loose object paths directly, 2017-01-13). The pattern of it
being the task of both sides to free() the memory has been there in
this form since its inception.

1. https://valgrind.org/docs/manual/mc-manual.html#mc-manual.leaks

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-11 13:40:43 -08:00
a7df4f52af Revert "connected: do not sort input revisions"
This reverts commit f45022dc2f,
as this is like breakage in the traversal more likely.  In a
history with 10 single strand of pearls,

   1-->2-->3--...->7-->8-->9-->10

asking "rev-list --unsorted-input 1 10 --not 9 8 7 6 5 4" fails to
paint the bottom 1 uninteresting as the traversal stops, without
completing the propagation of uninteresting bit starting at 4 down
through 3 and 2 to 1.
2021-11-11 12:34:41 -08:00
168a937bbc object-file: fix SEGV on free() regression in v2.34.0-rc2
Fix a regression introduced in my 96e41f58fe (fsck: report invalid
object type-path combinations, 2021-10-01). When fsck-ing blobs larger
than core.bigFileThreshold, we'd free() a pointer to uninitialized
memory.

This issue would have been caught by SANITIZE=address, but since it
involves core.bigFileThreshold, none of the existing tests in our test
suite covered it.

Running them with the "big_file_threshold" in "environment.c" changed
to say "6" would have shown this failure, but let's add a dedicated
test for this scenario based on Han Xin's report[1].

The bug was introduced between v9 and v10[2] of the fsck series merged
in 061a21d36d (Merge branch 'ab/fsck-unexpected-type', 2021-10-25).

1. https://lore.kernel.org/git/20211111030302.75694-1-hanxin.hx@alibaba-inc.com/
2. https://lore.kernel.org/git/cover-v10-00.17-00000000000-20211001T091051Z-avarab@gmail.com/

Reported-by: Han Xin <chiyutianyi@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-11 10:41:54 -08:00
143b963b60 l10n: vi(5211t): Translation for v2.34.0 rd3
Signed-off-by: Tran Ngoc Quan <vnwildman@gmail.com>
2021-11-11 13:19:34 +07:00
2b98abce71 Merge branch 'l10n/zh_TW/211111' of github.com:l10n-tw/git-po
* 'l10n/zh_TW/211111' of github.com:l10n-tw/git-po:
  l10n: zh_TW.po: v2.34.0 round 3 (0 untranslated)
2021-11-11 08:28:26 +08:00
f35f25083e Merge branch 'fr_v2.34.0_rnd3' of github.com:jnavila/git
* 'fr_v2.34.0_rnd3' of github.com:jnavila/git:
  l10n: fr: v2.34.0 rnd 3
2021-11-11 08:27:49 +08:00
1ffc3588fd Merge branch 'tr-2-34-r3' of github.com:bitigchi/git-po
* 'tr-2-34-r3' of github.com:bitigchi/git-po:
  l10n: tr: v2.34.0 round 3
2021-11-11 08:26:54 +08:00
4d53e91c6b A few hotfixes
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-10 15:01:21 -08:00
fe319d5fe1 Merge branch 'jk/ssh-signing-fix'
Reject OpenSSH 8.7 whose "ssh-keygen -Y find-principals" is
unusable from running the ssh signature tests.

* jk/ssh-signing-fix:
  t/lib-gpg: avoid broken versions of ssh-keygen
2021-11-10 15:01:21 -08:00
aace36fd3c Merge branch 'js/simple-ipc-cygwin-socket-fix'
The way Cygwin emulates a unix-domain socket, on top of which the
simple-ipc mechanism is implemented, can race with the program on
the other side that wants to use the socket, and briefly make it
appear as a regular file before lstat(2) starts reporting it as a
socket.  We now have a workaround on the side that connects to a
unix domain socket.

* js/simple-ipc-cygwin-socket-fix:
  simple-ipc: work around issues with Cygwin's Unix socket emulation
2021-11-10 15:01:20 -08:00
c1d16cedd4 Merge branch 'ds/no-usable-cron-on-macos'
"git maintenance run" learned to use system supplied scheduler
backend, but cron on macOS turns out to be unusable for this
purpose.

* ds/no-usable-cron-on-macos:
  maintenance: disable cron on macOS
2021-11-10 15:01:20 -08:00
7c7cf62c48 Merge branch 'jc/fix-pull-ff-only-when-already-up-to-date'
"git pull --ff-only" and "git pull --rebase --ff-only" should make
it a no-op to attempt pulling from a remote that is behind us, but
instead the command errored out by saying it was impossible to
fast-forward, which may technically be true, but not a useful thing
to diagnose as an error.  This has been corrected.

* jc/fix-pull-ff-only-when-already-up-to-date:
  pull: --ff-only should make it a noop when already-up-to-date
2021-11-10 15:01:19 -08:00
569a03f24b l10n: zh_TW.po: v2.34.0 round 3 (0 untranslated)
Signed-off-by: Yi-Jyun Pan <pan93412@gmail.com>
2021-11-11 06:43:41 +08:00
ca7a5bf4bd t/lib-gpg: avoid broken versions of ssh-keygen
The "-Y find-principals" option of ssh-keygen seems to be broken in
Debian's openssh-client 1:8.7p1-1, whereas it works fine in 1:8.4p1-5.
This causes several failures for GPGSSH tests. We fulfill the
prerequisite because generating the keys works fine, but actually
verifying a signature causes results ranging from bogus results to
ssh-keygen segfaulting.

We can find the broken version during the prereq check by feeding it
empty input. This should result in it complaining to stderr, but in the
broken version it triggers the segfault, causing the GPGSSH tests to be
skipped.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-10 14:14:37 -08:00
bbb7d710ea l10n: fr: v2.34.0 rnd 3
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
2021-11-10 22:01:57 +01:00
689a2aa719 maintenance: disable cron on macOS
In eba1ba9 (maintenance: `git maintenance run` learned
`--scheduler=<scheduler>`, 2021-09-04), we introduced the ability to
specify a scheduler explicitly. This led to some extra checks around
whether an alternative scheduler was available. This added the
functionality of removing background maintenance from schedulers other
than the one selected.

On macOS, cron is technically available, but running 'crontab' triggers
a UI prompt asking for special permissions. This is the major reason why
launchctl is used as the default scheduler. The is_crontab_available()
method triggers this UI prompt, causing user disruption.

Remove this disruption by using an #ifdef to prevent running crontab
this way on macOS. This has the unfortunate downside that if a user
manually selects cron via the '--scheduler' option, then adjusting the
scheduler later will not remove the schedule from cron. The
'--scheduler' option ignores the is_available checks, which is how we
can get into this situation.

Extract the new check_crontab_process() method to avoid making the
'child' variable unused on macOS. The method is marked MAYBE_UNUSED
because it has no callers on macOS.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-10 11:20:20 -08:00
e9f197e057 l10n: tr: v2.34.0 round 3
Signed-off-by: Emir Sarı <bitigchi@me.com>
2021-11-10 21:21:28 +03:00
974ef7ced2 simple-ipc: work around issues with Cygwin's Unix socket emulation
Cygwin emulates Unix sockets by writing files with custom contents and
then marking them as system files.

The tricky problem is that while the file is written and its `system`
bit is set, it is still identified as a file. This caused test failures
when Git is too fast looking for the Unix sockets and then complains
that there is a plain file in the way.

Let's work around this by adding a delayed retry loop, specifically for
Cygwin.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Tested-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-10 09:12:19 -08:00
ffa14514a9 l10n: zh_CN: v2.34.0 round 3
Signed-off-by: Fangyi Zhou <me@fangyi.io>
2021-11-10 12:33:51 +00:00
bc9adb42cb Merge branch 'master' of github.com:alshopov/git-po
* 'master' of github.com:alshopov/git-po:
  l10n: bg.po: Updated Bulgarian translation (5210t)
2021-11-10 10:18:44 +08:00
d83443820f l10n: git.pot: v2.34.0 round 3 (1 new)
Generate po/git.pot from v2.34.0-rc2 for git v2.34.0 l10n round 3.

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
2021-11-10 08:56:22 +08:00
3a6160031b Merge branch 'master' of github.com:git/git
* 'master' of github.com:git/git:
  Git 2.34-rc2
  parse-options.[ch]: revert use of "enum" for parse_options()
  t/lib-git.sh: fix ACL-related permissions failure
  A few fixes before -rc2
  async_die_is_recursing: work around GCC v11.x issue on Fedora
  Document positive variant of commit and merge option "--no-verify"
  pull: honor --no-verify and do not call the commit-msg hook
  http-backend: remove a duplicated code branch
2021-11-10 08:55:14 +08:00
6c220937e2 Git 2.34-rc2
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-09 13:19:51 -08:00
84c99b2023 Merge branch 'ab/parse-options-cleanup'
Last minute fix to the update already in 'master'.

* ab/parse-options-cleanup:
  parse-options.[ch]: revert use of "enum" for parse_options()
2021-11-09 13:19:06 -08:00
92dd0a55d0 Merge branch 'ad/ssh-signing-testfix'
Fix ssh-signing test to work on a platform where the default ACL is
overly loose to upset OpenSSH (reported on an installation of Cygwin).

* ad/ssh-signing-testfix:
  t/lib-git.sh: fix ACL-related permissions failure
2021-11-09 13:19:06 -08:00
06a199f38b parse-options.[ch]: revert use of "enum" for parse_options()
Revert the parse_options() prototype change in my recent
352e761388 (parse-options.[ch]: consistently use "enum
parse_opt_result", 2021-10-08) was incorrect. The parse_options()
function returns the number of argc elements that haven't been
processed, not "enum parse_opt_result".

Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-09 09:45:37 -08:00
ed7fe7be52 l10n: pl: 2.34.0 round 2
Signed-off-by: Arusekk <arek_koz@o2.pl>
2021-11-09 14:59:51 +01:00
d3600a1ad7 l10n: vi(5210t): Translation for v2.34.0 rd2
Signed-off-by: Tran Ngoc Quan <vnwildman@gmail.com>
2021-11-08 09:02:58 +07:00
09fb894a5f l10n: es: 2.34.0 round 2
Signed-off-by: Christopher Diaz Riveros <christopher.diaz.riv@gmail.com>
Signed-off-by: Omar Olivares <omar@olivares.cl>
Signed-off-by: Jaime Marquínez Ferrándiz <jaime.marquinez.ferrandiz@fastmail.net>
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
2021-11-07 19:13:01 -05:00
fa800afe59 l10n: Update Catalan translation
Signed-off-by: Jordi Mas <jmas@softcatala.org>
2021-11-06 13:43:08 +01:00
ed5fa68872 l10n: bg.po: Updated Bulgarian translation (5210t)
Signed-off-by: Alexander Shopov <ash@kambanaria.org>
2021-11-06 12:17:35 +01:00
1b372d1ccb Merge branch 'pt-PT' of github.com:git-l10n-pt-PT/git-po
* 'pt-PT' of github.com:git-l10n-pt-PT/git-po:
  l10n: pt_PT: cleaning duplicate translations (#2)
2021-11-06 12:32:08 +08:00
26274f24db Merge branch 'l10n/zh_TW/211104' of github.com:l10n-tw/git-po
* 'l10n/zh_TW/211104' of github.com:l10n-tw/git-po:
  l10n: zh_TW.po: v2.34.0 round 2 (0 untranslated)
2021-11-06 12:16:02 +08:00
7140c4988f t/lib-git.sh: fix ACL-related permissions failure
As well as checking that the relevant functionality is available, the
GPGSSH prerequisite check creates the SSH keys that are used by the test
functions it gates.  If these keys are created in a directory that
has a default Access Control List, the key files can inherit those
permissions.

This can result in a scenario where the private keys are created
successfully, so the prerequisite check passes and the tests are run,
but the key files have permissions that are too permissive, meaning
OpenSSH will refuse to load them and the tests will fail.

To avoid this happening, before creating the keys, clear any default ACL
set on the directory that will contain them.  This step allowed to fail;
if setfacl isn't present, that's a very likely indicator that the
filesystem in question simply doesn't support default ACLs.

Helped-by: Fabian Stelzer <fs@gigacodes.de>
Signed-off-by: Adam Dinwoodie <adam@dinwoodie.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-05 13:44:37 -07:00
3a7746a66e l10n: fr: v2.34.0 round 2
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
2021-11-05 20:41:17 +01:00
e1d1c94364 Merge branch 'po-id' of github.com:bagasme/git-po
* 'po-id' of github.com:bagasme/git-po:
  l10n: po-id for 2.34 (round 2)
2021-11-05 19:48:09 +08:00
3411fa0f0f l10n: po-id for 2.34 (round 2)
Translate following new components:

  * gpg-interface.c
  * send-pack.c
  * fetch-pack.c
  * upload-pack.c

Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
2021-11-05 16:50:31 +07:00
048a41db4d l10n: zh_TW.po: v2.34.0 round 2 (0 untranslated)
Signed-off-by: pan93412 <pan93412@gmail.com>
2021-11-05 12:46:52 +08:00
d42b4ce5a0 Merge branch 'master' of github.com:nafmo/git-l10n-sv
* 'master' of github.com:nafmo/git-l10n-sv:
  l10n: sv.po: Update Swedish translation (5210t0f0u)
2021-11-05 08:18:57 +08:00
02f2875d3b Merge branch 'fz/po-zh_CN' of github.com:fangyi-zhou/git-po
* 'fz/po-zh_CN' of github.com:fangyi-zhou/git-po:
  l10n: zh_CN: 2.34.0 Round 2
2021-11-05 08:17:50 +08:00
f776897d4f l10n: sv.po: Update Swedish translation (5210t0f0u)
Signed-off-by: Peter Krefting <peter@softwolves.pp.se>
2021-11-04 20:44:36 +01:00
88d915a634 A few fixes before -rc2
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-04 12:24:46 -07:00
9cc14a5b5d Sync with maint 2021-11-04 12:24:40 -07:00
5fbd2fc599 Merge branch 'vd/pthread-setspecific-g11-fix' into maint
One CI task based on Fedora image noticed a not-quite-kosher
consturct recently, which has been corrected.

* vd/pthread-setspecific-g11-fix:
  async_die_is_recursing: work around GCC v11.x issue on Fedora
2021-11-04 12:24:20 -07:00
494cb27e57 Merge branch 'ma/doc-git-version' into maint
Typofix.

* ma/doc-git-version:
  git.txt: fix typo
2021-11-04 12:22:10 -07:00
ecb8d9d11e Merge branch 'pw/rebase-r-fixes' into maint
Regression fix.

* pw/rebase-r-fixes:
  rebase -i: fix rewording with --committer-date-is-author-date
2021-11-04 12:20:14 -07:00
99c7db563f Merge branch 'jk/log-warn-on-bogus-encoding' into maint
Squelch over-eager warning message added during this cycle.

* jk/log-warn-on-bogus-encoding:
  log: document --encoding behavior on iconv() failure
  Revert "logmsg_reencode(): warn when iconv() fails"
2021-11-04 12:20:13 -07:00
a73934c320 Merge branch 'vd/pthread-setspecific-g11-fix'
One CI task based on Fedora image noticed a not-quite-kosher
consturct recently, which has been corrected.

* vd/pthread-setspecific-g11-fix:
  async_die_is_recursing: work around GCC v11.x issue on Fedora
2021-11-04 12:07:47 -07:00
ada03fcbeb Merge branch 'rd/http-backend-code-simplification'
Code simplification.

* rd/http-backend-code-simplification:
  http-backend: remove a duplicated code branch
2021-11-04 12:07:46 -07:00
2b647089ba Merge branch 'ar/no-verify-doc'
Doc update.

* ar/no-verify-doc:
  Document positive variant of commit and merge option "--no-verify"
2021-11-04 12:07:46 -07:00
a876f0b95c Merge branch 'ar/fix-git-pull-no-verify'
"git pull --no-verify" did not affect the underlying "git merge".

* ar/fix-git-pull-no-verify:
  pull: honor --no-verify and do not call the commit-msg hook
2021-11-04 12:07:46 -07:00
84bf71eae7 l10n: zh_CN: 2.34.0 Round 2
Signed-off-by: Fangyi Zhou <me@fangyi.io>
2021-11-04 11:02:05 +00:00
e4fa191d02 l10n: tr: v2.34.0 round 2
Signed-off-by: Emir Sarı <bitigchi@me.com>
2021-11-04 13:19:12 +03:00
4b540cf913 async_die_is_recursing: work around GCC v11.x issue on Fedora
This fix corrects an issue found in the `dockerized(pedantic, fedora)` CI
build, first appearing after the introduction of a new version of the Fedora
docker image version. This image includes a version of `glibc` with the
attribute `__attr_access_none` added to `pthread_setspecific` [1], the
implementation of which only exists for GCC 11.X - the version included in
the Fedora image. The attribute requires that the pointer provided in the
second argument of `pthread_getspecific` must, if not NULL, be a pointer to
a valid object. In the usage in `async_die_is_recursing`, `(void *)1` is not
valid, causing the error.

This fix imitates a workaround added in SELinux [2] by using the pointer to
the static `async_die_counter` itself as the second argument to
`pthread_setspecific`. This guaranteed non-NULL, valid pointer matches the
intent of the current usage while not triggering the build error.

[1] https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=a1561c3bbe8
[2] https://lore.kernel.org/all/20211021140519.6593-1-cgzones@googlemail.com/

Co-authored-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Victoria Dye <vdye@github.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-03 23:12:10 -07:00
492288c70a l10n: git.pot: v2.34.0 round 2 (3 new, 3 removed)
Generate po/git.pot from v2.34.0-rc1 for git v2.34.0 l10n round 2.

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
2021-11-04 08:35:51 +08:00
bbf1932c30 Merge branch 'master' of github.com:git/git
* 'master' of github.com:git/git:
  Git 2.34-rc1
  rebase -i: fix rewording with --committer-date-is-author-date
  dir: fix directory-matching bug
  gpg-interface: avoid buffer overrun in parse_ssh_output()
  gpg-interface: handle missing " with " gracefully in parse_ssh_output()
  A few more topics before -rc1
  i18n: fix typos found during l10n for git 2.34.0
  t5310: drop lib-bundle.sh include
  format-patch (doc): clarify --base=auto
  gc: perform incremental repack when implictly enabled
  fsck: verify multi-pack-index when implictly enabled
  fsck: verify commit graph when implicitly enabled
  grep/pcre2: fix an edge case concerning ascii patterns and UTF-8 data
  commit-graph: don't consider "replace" objects with "verify"
  commit-graph tests: fix another graph_git_two_modes() helper
  commit-graph tests: fix error-hiding graph_git_two_modes() helper
  pretty: colorize pattern matches in commit messages
  grep: refactor next_match() and match_one_pattern() for external use
2021-11-04 08:34:15 +08:00
66e6babac6 Merge branch 'fz/po-zh_CN' of github.com:fangyi-zhou/git-po
* 'fz/po-zh_CN' of github.com:fangyi-zhou/git-po:
  l10n: zh-CN: v2.34.0 round 1
2021-11-04 08:23:26 +08:00
876b142331 Git 2.34-rc1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-03 13:32:40 -07:00
0cb1330bc6 Merge branch 'pw/rebase-r-fixes'
Regression fix.

* pw/rebase-r-fixes:
  rebase -i: fix rewording with --committer-date-is-author-date
2021-11-03 13:32:29 -07:00
36f0a2e20f Merge branch 'ds/add-rm-with-sparse-index'
Regression fix.

* ds/add-rm-with-sparse-index:
  dir: fix directory-matching bug
2021-11-03 13:32:28 -07:00
e2a33ef9e2 Merge branch 'jx/message-fixes'
Fixes to recently added messages.

* jx/message-fixes:
  i18n: fix typos found during l10n for git 2.34.0
2021-11-03 13:32:28 -07:00
e890c845b8 Merge branch 'rs/ssh-signing-fix'
Fixes to recently merged topic.

* rs/ssh-signing-fix:
  gpg-interface: avoid buffer overrun in parse_ssh_output()
  gpg-interface: handle missing " with " gracefully in parse_ssh_output()
2021-11-03 13:32:28 -07:00
9d6b9df128 rebase -i: fix rewording with --committer-date-is-author-date
baf8ec8d3a (rebase -r: don't write .git/MERGE_MSG when
fast-forwarding, 2021-08-20) stopped reading the author script in
run_git_commit() when rewording a commit. This is normally safe
because "git commit --amend" preserves the authorship. However if the
user passes "--committer-date-is-author-date" then we need to read the
author date from the author script when rewording. Fix this regression
by tightening the check for when it is safe to skip reading the author
script.

Reported-by: Jonas Kittner <jonas.kittner@ruhr-uni-bochum.de>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-03 10:44:45 -07:00
5ceb663e92 dir: fix directory-matching bug
This reverts the change from ed49584 (dir: fix pattern matching on dirs,
2021-09-24), which claimed to fix a directory-matching problem without a
test case. It turns out to _create_ a bug, but it is a bit subtle.

The bug would have been revealed by the first of two tests being added to
t0008-ignores.sh. The first uses a pattern "/git/" inside the a/.gitignores
file, which matches against 'a/git/foo' but not 'a/git-foo/bar'. This test
would fail before the revert.

The second test shows what happens if the test instead uses a pattern "git/"
and this test passes both before and after the revert.

The difference in these two cases are due to how
last_matching_pattern_from_list() checks patterns both if they have the
PATTERN_FLAG_MUSTBEDIR and PATTERN_FLAG_NODIR flags. In the case of "git/",
the PATTERN_FLAG_NODIR is also provided, making the change in behavior in
match_pathname() not affect the end result of
last_matching_pattern_from_list().

Reported-by: Glen Choo <chooglen@google.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-03 10:10:36 -07:00
6b53a80441 l10n: pl: Update translation
Signed-off-by: Arusekk <arek_koz@o2.pl>
2021-11-03 16:05:00 +01:00
7e6630a76d l10n: zh-CN: v2.34.0 round 1
Reviewed-by: 依云 <lilydjwg@gmail.com>
Reviewed-by: Jiang Xin <worldhello.net@gmail.com>
Signed-off-by: Fangyi Zhou <me@fangyi.io>
2021-11-02 18:04:10 +00:00
65db97b4fa gpg-interface: avoid buffer overrun in parse_ssh_output()
If the string "key" we found in the output of ssh-keygen happens to be
located at the very end of the line, then going four characters further
leaves us beyond the end of the string.  Explicitly search for the
space after "key" to handle a missing one gracefully.

Signed-off-by: René Scharfe <l.s.r@web.de>
Acked-by: Fabian Stelzer <fs@gigacodes.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-01 17:00:41 -07:00
18b18503e3 gpg-interface: handle missing " with " gracefully in parse_ssh_output()
If the output of ssh-keygen starts with "Good \"git\" signature for ",
but is not followed by " with " for some reason, then parse_ssh_output()
uses -1 as the len parameter of xmemdupz(), which in turn will end the
program.  Reject the signature and carry on instead in that case.

Signed-off-by: René Scharfe <l.s.r@web.de>
Acked-by: Fabian Stelzer <fs@gigacodes.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-01 17:00:41 -07:00
0cddd84c9f A few more topics before -rc1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-01 13:48:08 -07:00
cfd86ee3dd Merge branch 'ab/test-lib'
Test (cosmetic) fix.

* ab/test-lib:
  t5310: drop lib-bundle.sh include
2021-11-01 13:48:08 -07:00
7baf6588c5 Merge branch 'jc/doc-format-patch-clarify-auto-base'
Rephrase the description of "format-patch --base=auto".

* jc/doc-format-patch-clarify-auto-base:
  format-patch (doc): clarify --base=auto
2021-11-01 13:48:08 -07:00
7afb458e91 Merge branch 'gc/use-repo-settings'
It is wrong to read some settings directly from the config
subsystem, as things like feature.experimental can affect their
default values.

* gc/use-repo-settings:
  gc: perform incremental repack when implictly enabled
  fsck: verify multi-pack-index when implictly enabled
  fsck: verify commit graph when implicitly enabled
2021-11-01 13:48:08 -07:00
b82299ec6f Merge branch 'ab/ignore-replace-while-working-on-commit-graph'
Teach "git commit-graph" command not to allow using replace objects
at all, as we do not use the commit-graph at runtime when we see
object replacement.

* ab/ignore-replace-while-working-on-commit-graph:
  commit-graph: don't consider "replace" objects with "verify"
  commit-graph tests: fix another graph_git_two_modes() helper
  commit-graph tests: fix error-hiding graph_git_two_modes() helper
2021-11-01 13:48:08 -07:00
b93d720691 Merge branch 'hm/paint-hits-in-log-grep'
"git log --grep=string --author=name" learns to highlight hits just
like "git grep string" does.

* hm/paint-hits-in-log-grep:
  grep/pcre2: fix an edge case concerning ascii patterns and UTF-8 data
  pretty: colorize pattern matches in commit messages
  grep: refactor next_match() and match_one_pattern() for external use
2021-11-01 13:48:08 -07:00
42b297fe8c l10n: pt_PT: cleaning duplicate translations (#2)
* cleaning duplicate incorrect translations part 2
 * update translation table
 * unfuzzy all entries
 * typos check in git commands and git flags
 * random line translations
 * updating all phrases #1

Signed-off-by: Daniel Santos <daniel@brilhante.top>
2021-11-01 15:53:00 +00:00
c0a8212802 l10n: po-id for 2.34 (round 1)
Update following components:
  * add-interactive.c
  * add-patch.c
  * diff.c
  * gc.c
  * builtin/bisect--helper.c
  * builtin/commit.c
  * builtin/fetch.c
  * builtin/merge.c
  * builtin/rebase.c
  * builtin/pull.c
  * builtin/push.c
  * builtin/submodule--helper.c

Translate following new components:
  * apply.c
  * bundle.c
  * git.c
  * git-send-email.perl
  * rerere.c
  * builtin/am.c
  * builtin/apply.c
  * builtin/archive.c
  * builtin/bundle.c
  * builtin/cat-file.c
  * builtin/commit-tree.c
  * builtin/count-objects.c
  * builtin/difftool.c
  * builtin/fast-export.c
  * builtin/fast-import.c
  * builtin/fetch-pack.c
  * builtin/fsck.c
  * builtin/gc.c
  * builtin/hash-object.c
  * builtin/ls-files.c
  * builtin/ls-remote.c
  * builtin/ls-tree.c
  * builtin/read-tree.c
  * builtin/receive-pack.c
  * builtin/reflog.c
  * builtin/repack.c
  * builtin/rev-list.c
  * builtin/rev-parse.c
  * builtin/send-pack.c
  * builtin/symbolic-ref.c
  * builtin/update-index.c
  * builtin/update-ref.c
  * builtin/upload-pack.c
  * builtin/verify-pack.c
  * builtin/verify-commit.c
  * builtin/verify-tag.c
  * builtin/write-tree.c

Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
2021-11-01 17:58:04 +07:00
f733719316 i18n: fix typos found during l10n for git 2.34.0
Emir and Jean-Noël reported typos in some i18n messages when preparing
l10n for git 2.34.0.

* Fix unstable spelling of config variable "gpg.ssh.defaultKeyCommand"
  which was introduced in commit fd9e226776 (ssh signing: retrieve a
  default key from ssh-agent, 2021-09-10).

* Add missing space between "with" and "--python" which was introduced
  in commit bd0708c7eb (ref-filter: add %(raw) atom, 2021-07-26).

* Fix unmatched single quote in 'builtin/index-pack.c' which was
  introduced in commit 8737dab346 (index-pack: refactor renaming in
  final(), 2021-09-09)

[1] https://github.com/git-l10n/git-po/pull/567

Reported-by: Emir Sarı <bitigchi@me.com>
Reported-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-31 22:49:49 -07:00
235233a394 Merge branch 'fr_2.34.0_rnd1' of github.com:jnavila/git
* 'fr_2.34.0_rnd1' of github.com:jnavila/git:
  l10n: fr v2.34.0 rnd1
2021-11-01 09:49:18 +08:00
d01af2568e l10n: fr v2.34.0 rnd1
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
2021-10-31 17:48:23 +01:00
5650c37365 l10n: tr: v2.34.0 round 1
Signed-off-by: Emir Sarı <bitigchi@me.com>
2021-10-30 21:07:51 +03:00
d00296a8bf l10n: Update Catalan translation
Signed-off-by: Jordi Mas <jmas@softcatala.org>
2021-10-30 09:54:25 +02:00
1f511a9b56 l10n: git.pot: v2.34.0 round 1 (134 new, 154 removed)
Generate po/git.pot from v2.34.0-rc0 for git v2.34.0 l10n round 1.

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
2021-10-30 09:37:04 +08:00
cd9ef9ce67 Merge branch 'master' of github.com:git/git
* 'master' of github.com:git/git: (762 commits)
  Git 2.34-rc0
  wrapper: remove xunsetenv()
  log: document --encoding behavior on iconv() failure
  Revert "logmsg_reencode(): warn when iconv() fails"
  completion: fix incorrect bash/zsh string equality check
  add, rm, mv: fix bug that prevents the update of non-sparse dirs
  git-bundle.txt: add missing words and punctuation
  Documentation/Makefile: fix lint-docs mkdir dependency
  submodule: drop unused sm_name parameter from append_fetch_remotes()
  The fifteenth batch
  gitweb.txt: change "folder" to "directory"
  gitignore.txt: change "folder" to "directory"
  git-multi-pack-index.txt: change "folder" to "directory"
  git.txt: fix typo
  archive: describe compression level option
  config.txt: fix typo
  command-list.txt: remove 'sparse-index' from main help
  userdiff-cpp: back out the digit-separators in numbers
  submodule--helper: fix incorrect newlines in an error message
  branch (doc): -m/-c copies config and reflog
  ...
2021-10-30 09:34:30 +08:00
7e27bd589d Git 2.34-rc0
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-29 15:43:50 -07:00
9a95a9f230 Merge branch 're/completion-fix-test-equality'
Fix long-standing shell syntax error in the completion script.

* re/completion-fix-test-equality:
  completion: fix incorrect bash/zsh string equality check
2021-10-29 15:43:16 -07:00
68fb83b58e Merge branch 'mt/fix-add-rm-with-sparse-index'
Fix-up to a topic already merged to 'master'.

* mt/fix-add-rm-with-sparse-index:
  add, rm, mv: fix bug that prevents the update of non-sparse dirs
2021-10-29 15:43:16 -07:00
7b3fd03e6a Merge branch 'cm/drop-xunsetenv'
Drop a helper function that has never been used since its addition.

* cm/drop-xunsetenv:
  wrapper: remove xunsetenv()
2021-10-29 15:43:16 -07:00
a31efa77c6 Merge branch 'jk/log-warn-on-bogus-encoding'
Squelch over-eager warning message added during this cycle.

* jk/log-warn-on-bogus-encoding:
  log: document --encoding behavior on iconv() failure
  Revert "logmsg_reencode(): warn when iconv() fails"
2021-10-29 15:43:15 -07:00
4d1ae1a605 Merge branch 'ab/unbundle-progress'
Doc clarification.

* ab/unbundle-progress:
  git-bundle.txt: add missing words and punctuation
2021-10-29 15:43:15 -07:00
2343b75ca0 Merge branch 'jc/branch-copy-doc'
"git branch -c/-m new old" was not described to copy config, which
has been corrected.

* jc/branch-copy-doc:
  branch (doc): -m/-c copies config and reflog
2021-10-29 15:43:15 -07:00
fc0c491f65 Merge branch 'ma/doc-folder-to-directory'
Consistently use 'directory', not 'folder', to call the filesystem
entity that collects a group of files and, eh, directories.

* ma/doc-folder-to-directory:
  gitweb.txt: change "folder" to "directory"
  gitignore.txt: change "folder" to "directory"
  git-multi-pack-index.txt: change "folder" to "directory"
2021-10-29 15:43:15 -07:00
da007ba378 Merge branch 'sg/sparse-index-not-that-common-a-command'
Drop "git sparse-index" from the list of common commands.

* sg/sparse-index-not-that-common-a-command:
  command-list.txt: remove 'sparse-index' from main help
2021-10-29 15:43:15 -07:00
8b3bef88f7 Merge branch 'ma/doc-git-version'
Typofix.

* ma/doc-git-version:
  git.txt: fix typo
2021-10-29 15:43:14 -07:00
cca0a9da05 Merge branch 'js/expand-runtime-prefix'
Typofix.

* js/expand-runtime-prefix:
  config.txt: fix typo
2021-10-29 15:43:14 -07:00
a0f604ee56 Merge branch 'bs/archive-doc-compression-level'
Update "git archive" documentation and give explicit mention on the
compression level for both zip and tar.gz format.

* bs/archive-doc-compression-level:
  archive: describe compression level option
2021-10-29 15:43:14 -07:00
f54c172bb3 Merge branch 'ks/submodule-add-message-fix'
Message regression fix.

* ks/submodule-add-message-fix:
  submodule: drop unused sm_name parameter from append_fetch_remotes()
  submodule--helper: fix incorrect newlines in an error message
2021-10-29 15:43:14 -07:00
dacf0acdf6 Merge branch 'ab/fix-make-lint-docs'
Hotfix for a topic recently merged to 'master'.

* ab/fix-make-lint-docs:
  Documentation/Makefile: fix lint-docs mkdir dependency
2021-10-29 15:43:13 -07:00
942843e8dd Merge branch 'ab/sh-retire-rebase-preserve-merges'
Code clean-up to remove unused helpers.

* ab/sh-retire-rebase-preserve-merges:
  git-sh-setup: remove messaging supporting --preserve-merges
  git-sh-i18n: remove unused eval_ngettext()
2021-10-29 15:43:13 -07:00
192a3fa31d Merge branch 'ab/plug-random-leaks'
Leakfix.

* ab/plug-random-leaks:
  reflog: free() ref given to us by dwim_log()
  submodule--helper: fix small memory leaks
  clone: fix a memory leak of the "git_dir" variable
  grep: fix a "path_list" memory leak
  grep: use object_array_clear() in cmd_grep()
  grep: prefer "struct grep_opt" over its "void *" equivalent
2021-10-29 15:43:13 -07:00
55b99febc4 Merge branch 'ab/plug-handle-path-exclude-leak'
Leakfix.

* ab/plug-handle-path-exclude-leak:
  config.c: don't leak memory in handle_path_include()
2021-10-29 15:43:13 -07:00
c3673a8eb2 Merge branch 'ab/ref-filter-leakfix'
"git for-each-ref" family of commands were leaking the ref_sorting
instances that hold sorting keys specified by the user; this has
been corrected.

* ab/ref-filter-leakfix:
  branch: use ref_sorting_release()
  ref-filter API user: add and use a ref_sorting_release()
  tag: use a "goto cleanup" pattern, leak less memory
2021-10-29 15:43:12 -07:00
735907bde1 Merge branch 'jk/http-push-status-fix'
"git push" client talking to an HTTP server did not diagnose the
lack of the final status report from the other side correctly,
which has been corrected.

* jk/http-push-status-fix:
  transport-helper: recognize "expecting report" error from send-pack
  send-pack: complain about "expecting report" with --helper-status
2021-10-29 15:43:12 -07:00
8ba651b56e Merge branch 'ab/test-bail'
A new feature has been added to abort early in the test framework.

* ab/test-bail:
  test-lib.sh: use "Bail out!" syntax on bad SANITIZE=leak use
  test-lib.sh: de-duplicate error() teardown code
2021-10-29 15:43:12 -07:00
23112fc28c Merge branch 'ab/make-sparse-for-real'
Fix-up for a recent topic.

* ab/make-sparse-for-real:
  Makefile: remove redundant GIT-CFLAGS dependency from "sparse"
2021-10-29 15:43:12 -07:00
9ff67749fb Merge branch 'bs/doc-blame-color-lines'
Doc fix.

* bs/doc-blame-color-lines:
  git config doc: fix recent ASCIIDOC formatting regression
2021-10-29 15:43:12 -07:00
6fc527a8d0 wrapper: remove xunsetenv()
Remove the unused wrapper function.

Reported-by: Randall S. Becker <rsbecker@nexbridge.com>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-29 14:59:29 -07:00
9e8fe7b1c7 log: document --encoding behavior on iconv() failure
We already note that we may produce invalid output when we skip calling
iconv() altogether. But we may also do so if iconv() fails, and we have
no good alternative. Let's document this to avoid surprising users.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-29 14:35:59 -07:00
0988e665e9 Revert "logmsg_reencode(): warn when iconv() fails"
This reverts commit fd680bc5 (logmsg_reencode(): warn when iconv()
fails, 2021-08-27).  Throwing a warning for each and every commit
that gets reencoded, without allowing a way to squelch, would make
it unpleasant for folks who have to deal with an ancient part of the
history in an old project that used wrong encoding in the commits.
2021-10-29 13:48:58 -07:00
fa21296b58 Document positive variant of commit and merge option "--no-verify"
This documents "--verify" option of the commands. It can be used to re-enable
the hooks disabled by an earlier "--no-verify" in command-line.

Signed-off-by: Alexander Riesen <raa.lkml@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-29 11:22:56 -07:00
361cb52383 pull: --ff-only should make it a noop when already-up-to-date
Earlier, we made sure that "git pull --ff-only" (and "git -c
pull.ff=only pull") errors out when our current HEAD is not an
ancestor of the tip of the history we are merging, but the condition
to trigger the error was implemented incorrectly.

Imagine you forked from a remote branch, built your history on top
of it, and then attempted to pull from them again.  If they have not
made any update in the meantime, our current HEAD is obviously not
their ancestor, and this new error triggers.

Without the --ff-only option, we just report that there is no need
to pull; we did the same historically with --ff-only, too.

Make sure we do not fail with the recently added check to restore
the historical behaviour.

Reported-by: Kenneth Arnold <ka37@calvin.edu>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-29 00:15:39 -07:00
559664c792 t5310: drop lib-bundle.sh include
Commit ddfe900612 (test-lib-functions: move function to lib-bitmap.sh,
2021-02-09) meant to include lib-bitmap.sh in t5310, but also includes
lib-bundle.sh. Yet we don't use any of its functions, nor have anything
to do with bundles. This is probably just a typo/copy-paste error, as
lib-bundle.sh was added (correctly) to other scripts in the same series.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-29 00:06:12 -07:00
47bfdfb3fd pull: honor --no-verify and do not call the commit-msg hook
The option was incorrectly auto-translated to "--no-verify-signatures",
which causes the unexpected effect of the hook being called.
And an even more unexpected effect of disabling verification of signatures.

The manual page describes the option to behave same as the similarly
named option of "git merge", which seems to be the original intention
of this option in the "pull" command.

Signed-off-by: Alexander Riesen <raa.lkml@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-28 09:52:09 -07:00
46b0585286 completion: fix incorrect bash/zsh string equality check
In the basic `[`/`test` command, the string equality operator is a
single `=`. The `==` operator is only available in `[[`, which is a
bash-ism also supported by zsh.

This mix-up was causing the following completion error in zsh:
> __git_ls_files_helper:7: = not found

(That message refers to the extraneous symbol in `==` ← `=`).

This updates that comparison to use a single `=` inside the
basic `[ … ]` test conditional.

Although this fix is inconsistent with the other comparisons in this
file, which use `[[ … == … ]]`, and the two expressions are functionally
identical in this context, that approach was rejected due to a
preference for `[`.

Signed-off-by: Robert Estelle <robertestelle@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-28 09:33:19 -07:00
20141e322c add, rm, mv: fix bug that prevents the update of non-sparse dirs
These three commands recently learned to avoid updating paths outside
the sparse checkout even if they are missing the SKIP_WORKTREE bit. This
is done using path_in_sparse_checkout(), which checks whether a given
path matches the current list of sparsity rules, similar to what
clear_ce_flags() does when we run "git sparse checkout init" or "git
sparse-checkout reapply". However, clear_ce_flags() uses a recursive
approach, applying the match results from parent directories on paths
that get the UNDECIDED result, whereas path_in_sparse_checkout() only
attempts to match the full path and immediately considers UNDECIDED as
NOT_MATCHED. This makes the function miss matches with leading
directories. For example, if the user has the sparsity patterns "!/a"
and "b/", add, rm, and mv will fail to update the path "a/b/c" and end
up displaying a warning about it being outside the sparse checkout even
though it isn't. This problem only occurs in full pattern mode as the
pattern matching functions never return UNDECIDED for cone mode.

To fix this, replicate the recursive behavior of clear_ce_flags() in
path_in_sparse_checkout(), falling back to the parent directory match
when a path gets the UNDECIDED result. (If this turns out to be too
expensive in some cases, we may want to later add some form of caching
to accelerate multiple queries within the same directory. This is not
implemented in this patch, though.) Also add two tests for each affected
command (add, rm, and mv) to check that they behave correctly with the
recursive pattern matching. The first test would previously fail without
this patch while the second already succeeded. It is added mostly to
make sure that we are not breaking the existing pattern matching for
directories that are really sparse, and also as a protection against any
future regressions.

Two other existing tests had to be changed as well: one test in t3602
checks that "git rm -r <dir>" won't remove sparse entries, but it didn't
allow the non-sparse entries inside <dir> to be removed. The other one,
in t7002, tested that "git mv" would correctly display a warning message
for sparse paths, but it accidentally expected the message to include
two non-sparse paths as well.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-28 08:46:07 -07:00
a4dfb4491e git-bundle.txt: add missing words and punctuation
Add an "and" to separate the two halves of the first sentence of the
paragraph more. Add a comma to similarly separate the two halves of the
second sentence a bit better. Add a period at the end of the paragraph.

Further down in the file, add the missing "be" in "must be accompanied".

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-27 17:06:12 -07:00
4c64fb5aad Documentation/Makefile: fix lint-docs mkdir dependency
Since 8650c6298c (doc lint: make "lint-docs" non-.PHONY, 2021-10-15), we
put the output for gitlink linter into .build/lint-docs/gitlink. There
are order-only dependencies to create the sequence of subdirs like:

  .build/lint-docs: | .build
          $(QUIET)mkdir $@
  .build/lint-docs/gitlink: | .build/lint-docs
          $(QUIET)mkdir $@

where each level has to depend on the prior one (since the parent
directory must exist for us to create something inside it). But the
"howto" and "config" subdirectories of gitlink have the wrong
dependency; they depend on "lint-docs", not "lint-docs/gitlink".

This usually works out, because the LINT_DOCS_GITLINK targets which
depend on "gitlink/howto" also depend on just "gitlink", so the
directory gets created anyway. But since we haven't given make an
explicit ordering, things can racily happen out of order.

If you stick a "sleep 1" in the rule to build "gitlink" like this:

   ## Lint: gitlink
   .build/lint-docs/gitlink: | .build/lint-docs
  -	$(QUIET)mkdir $@
  +	$(QUIET)sleep 1 && mkdir $@

then "make clean; make lint-docs" will fail reliably. Or you can see it
as-is just by building the directory in isolation:

  $ make clean
  [...]
  $ make .build/lint-docs/gitlink/howto
      GEN mergetools-list.made
      GEN cmd-list.made
      GEN doc.dep
      SUBDIR ../
  make[1]: 'GIT-VERSION-FILE' is up to date.
      SUBDIR ../
  make[1]: 'GIT-VERSION-FILE' is up to date.
  mkdir: cannot create directory ‘.build/lint-docs/gitlink/howto’: No such file or directory
  make: *** [Makefile:476: .build/lint-docs/gitlink/howto] Error 1

The fix is easy: we just need to depend on the correct parent directory.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-27 16:57:14 -07:00
6b615dbece submodule: drop unused sm_name parameter from append_fetch_remotes()
Commit c21fb4676f (submodule--helper: fix incorrect newlines in an error
message, 2021-10-23) accidentally added a new, unused parameter while
changing the name and signature of show_fetch_remotes() to
append_fetch_remotes(). We can drop this to keep things simpler (and
satisfy -Wunused-parameter).

The error is likely because c21fb4676f is fixing a problem from
8c8195e9c3 (submodule--helper: introduce add-clone subcommand,
2021-07-10). An earlier iteration of that second commit introduced the
same unused parameter (though it was dropped before it finally made it
to 'next'), and the fix on top accidentally carried forward the extra
parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-27 10:42:11 -07:00
e9e5ba39a7 The fifteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-25 16:07:12 -07:00
c6fc44e9bf Merge branch 'ab/test-lib-diff-cleanup'
Test clean-up.

* ab/test-lib-diff-cleanup:
  tests: stop using top-level "README" and "COPYING" files
  "lib-diff" tests: make "README" and "COPYING" test data smaller
2021-10-25 16:07:01 -07:00
63ec2297d2 Merge branch 'ab/fix-make-lint-docs'
Build fix.

* ab/fix-make-lint-docs:
  doc lint: make "lint-docs" non-.PHONY
  doc build: speed up "make lint-docs"
  doc lint: emit errors on STDERR
  doc lint: fix error-hiding regression
2021-10-25 16:07:01 -07:00
06355d72dc Merge branch 'ab/pkt-line-cleanup'
Code clean-up.

* ab/pkt-line-cleanup:
  pkt-line.[ch]: remove unused packet_read_line_buf()
  pkt-line.[ch]: remove unused packet_buf_write_len()
2021-10-25 16:07:00 -07:00
d54fd59d84 Merge branch 'tb/fix-midx-rename-while-mapped'
The codepath to write a new version of .midx multi-pack index files
has learned to release the mmaped memory holding the current
version of .midx before removing them from the disk, as some
platforms do not allow removal of a file that still has mapping.

* tb/fix-midx-rename-while-mapped:
  midx.c: guard against commit_lock_file() failures
  midx.c: lookup MIDX by object directory during repack
  midx.c: lookup MIDX by object directory during expire
  midx.c: extract MIDX lookup by object_dir
2021-10-25 16:07:00 -07:00
67f310e1ad Merge branch 'ab/test-cleanly-recreate-trash-directory'
Improve test framework around unwritable directories.

* ab/test-cleanly-recreate-trash-directory:
  test-lib.sh: try to re-chmod & retry on failed trash removal
2021-10-25 16:07:00 -07:00
97ab03b12a Merge branch 'jc/doc-commit-header-continuation-line'
Doc update.

* jc/doc-commit-header-continuation-line:
  signature-format.txt: explain and illustrate multi-line headers
2021-10-25 16:07:00 -07:00
54c4f8ce52 Merge branch 'ab/mark-leak-free-tests-more'
Bunch of tests are marked as "passing leak check".

* ab/mark-leak-free-tests-more:
  merge: add missing strbuf_release()
  ls-files: add missing string_list_clear()
  ls-files: fix a trivial dir_clear() leak
  tests: fix test-oid-array leak, test in SANITIZE=leak
  tests: fix a memory leak in test-oidtree.c
  tests: fix a memory leak in test-parse-options.c
  tests: fix a memory leak in test-prio-queue.c
2021-10-25 16:06:59 -07:00
5a4f8381b6 Merge branch 'ab/mark-leak-free-tests'
Bunch of tests are marked as "passing leak check".

* ab/mark-leak-free-tests:
  leak tests: mark some misc tests as passing with SANITIZE=leak
  leak tests: mark various "generic" tests as passing with SANITIZE=leak
  leak tests: mark some read-tree tests as passing with SANITIZE=leak
  leak tests: mark some ls-files tests as passing with SANITIZE=leak
  leak tests: mark all checkout-index tests as passing with SANITIZE=leak
  leak tests: mark all trace2 tests as passing with SANITIZE=leak
  leak tests: mark all ls-tree tests as passing with SANITIZE=leak
  leak tests: run various "test-tool" tests in t00*.sh SANITIZE=leak
  leak tests: run various built-in tests in t00*.sh SANITIZE=leak
2021-10-25 16:06:59 -07:00
65ca3245f9 Merge branch 'ab/parse-options-cleanup'
Random changes to parse-options implementation.

* ab/parse-options-cleanup:
  parse-options: change OPT_{SHORT,UNSET} to an enum
  parse-options tests: test optname() output
  parse-options.[ch]: make opt{bug,name}() "static"
  commit-graph: stop using optname()
  parse-options.c: move optname() earlier in the file
  parse-options.h: make the "flags" in "struct option" an enum
  parse-options.c: use exhaustive "case" arms for "enum parse_opt_result"
  parse-options.[ch]: consistently use "enum parse_opt_result"
  parse-options.[ch]: consistently use "enum parse_opt_flags"
  parse-options.h: move PARSE_OPT_SHELL_EVAL between enums
2021-10-25 16:06:59 -07:00
f3f157ff27 Merge branch 'js/userdiff-cpp'
Userdiff patterns for the C++ language has been updated.

* js/userdiff-cpp:
  userdiff-cpp: back out the digit-separators in numbers
  userdiff-cpp: learn the C++ spaceship operator
  userdiff-cpp: permit the digit-separating single-quote in numbers
  userdiff-cpp: prepare test cases with yet unsupported features
  userdiff-cpp: tighten word regex
  t4034: add tests showing problematic cpp tokenizations
  t4034/cpp: actually test that operator tokens are not split
2021-10-25 16:06:59 -07:00
6a1bb089fd Merge branch 'da/mergetools-special-case-xxdiff-exit-128'
The xxdiff difftool backend can exit with status 128, which the
difftool-helper that launches the backend takes as a significant
failure, when it is not significant at all.  Work it around.

* da/mergetools-special-case-xxdiff-exit-128:
  mergetools/xxdiff: prevent segfaults from stopping difftool
2021-10-25 16:06:58 -07:00
ef1639145d Merge branch 'fs/ssh-signing-fix'
Fix-up for the other topic already in 'next'.

* fs/ssh-signing-fix:
  gpg-interface: fix leak of strbufs in get_ssh_key_fingerprint()
  gpg-interface: fix leak of "line" in parse_ssh_output()
  ssh signing: clarify trustlevel usage in docs
  ssh signing: fmt-merge-msg tests & config parse
2021-10-25 16:06:58 -07:00
18c6653da0 Merge branch 'fs/ssh-signing'
Use ssh public crypto for object and push-cert signing.

* fs/ssh-signing:
  ssh signing: test that gpg fails for unknown keys
  ssh signing: tests for logs, tags & push certs
  ssh signing: duplicate t7510 tests for commits
  ssh signing: verify signatures using ssh-keygen
  ssh signing: provide a textual signing_key_id
  ssh signing: retrieve a default key from ssh-agent
  ssh signing: add ssh key format and signing code
  ssh signing: add test prereqs
  ssh signing: preliminary refactoring and clean-up
2021-10-25 16:06:58 -07:00
e058b1846c Merge branch 'pw/sparse-cache-tree-verify-fix'
Recent sparse-index addition, namely any use of index_name_pos(),
can expand sparse index entries and breaks any code that walks
cache-tree or existing index entries.  One such instance of such a
breakage has been corrected.

* pw/sparse-cache-tree-verify-fix:
  t1092: run "rebase --apply" without "-q" in testing
  sparse index: fix use-after-free bug in cache_tree_verify()
2021-10-25 16:06:57 -07:00
2c428e4205 Merge branch 'ab/fix-commit-error-message-upon-unwritable-object-store'
"git commit" gave duplicated error message when the object store
was unwritable, which has been corrected.

* ab/fix-commit-error-message-upon-unwritable-object-store:
  commit: fix duplication regression in permission error output
  unwritable tests: assert exact error output
2021-10-25 16:06:57 -07:00
6ffb5fc069 Merge branch 'rs/add-dry-run-without-objects'
Stop "git add --dry-run" from creating new blob and tree objects.

* rs/add-dry-run-without-objects:
  add: don't write objects with --dry-run
2021-10-25 16:06:57 -07:00
525705a0a2 Merge branch 'rs/disable-gc-during-perf-tests'
Avoid performance measurements from getting ruined by gc and other
housekeeping pauses interfering in the middle.

* rs/disable-gc-during-perf-tests:
  perf: disable automatic housekeeping
2021-10-25 16:06:57 -07:00
162a13b855 Merge branch 'jt/no-abuse-alternate-odb-for-submodules'
Follow through the work to use the repo interface to access
submodule objects in-process, instead of abusing the alternate
object database interface.

* jt/no-abuse-alternate-odb-for-submodules:
  submodule: trace adding submodule ODB as alternate
  submodule: pass repo to check_has_commit()
  object-file: only register submodule ODB if needed
  merge-{ort,recursive}: remove add_submodule_odb()
  refs: peeling non-the_repository iterators is BUG
  refs: teach arbitrary repo support to iterators
  refs: plumb repo into ref stores
2021-10-25 16:06:56 -07:00
bfa646c2cb Merge branch 'ab/unpack-trees-leakfix'
Leakfix.

* ab/unpack-trees-leakfix:
  sequencer: fix a memory leak in do_reset()
  sequencer: add a "goto cleanup" to do_reset()
  unpack-trees: don't leak memory in verify_clean_subdirectory()
2021-10-25 16:06:56 -07:00
91016984db Merge branch 'jh/perf-remove-test-times'
Perf test fix.

* jh/perf-remove-test-times:
  t/perf/perf-lib.sh: remove test_times.* at the end test_perf_()
2021-10-25 16:06:56 -07:00
061a21d36d Merge branch 'ab/fsck-unexpected-type'
"git fsck" has been taught to report mismatch between expected and
actual types of an object better.

* ab/fsck-unexpected-type:
  fsck: report invalid object type-path combinations
  fsck: don't hard die on invalid object types
  object-file.c: stop dying in parse_loose_header()
  object-file.c: return ULHR_TOO_LONG on "header too long"
  object-file.c: use "enum" return type for unpack_loose_header()
  object-file.c: simplify unpack_loose_short_header()
  object-file.c: make parse_loose_header_extended() public
  object-file.c: return -1, not "status" from unpack_loose_header()
  object-file.c: don't set "typep" when returning non-zero
  cat-file tests: test for current --allow-unknown-type behavior
  cat-file tests: add corrupt loose object test
  cat-file tests: test for missing/bogus object with -t, -s and -p
  cat-file tests: move bogus_* variable declarations earlier
  fsck tests: test for garbage appended to a loose object
  fsck tests: test current hash/type mismatch behavior
  fsck tests: refactor one test to use a sub-repo
  fsck tests: add test for fsck-ing an unknown type
2021-10-25 16:06:56 -07:00
236bae14da gitweb.txt: change "folder" to "directory"
We prefer "directory" over "folder" when discussing the file system
concept. Change this instance for consistency.

After this, the only hits for '\<folder\>' in Documentation/ relate to
IMAP folders.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-25 11:06:57 -07:00
c314b62553 gitignore.txt: change "folder" to "directory"
We prefer "directory" over "folder" when discussing the file system
concept. Change this instance for consistency -- indeed, even within
this paragraph, we already use "directory".

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-25 11:06:56 -07:00
85bc006561 git-multi-pack-index.txt: change "folder" to "directory"
We prefer "directory" over "folder" when discussing the file system
concept. In all of our documentation, these are the only spots where we
refer to the `.git` directory as a folder. Switch to "directory", and
while doing so, add backticks to the ".git" filename to set it in
monospace.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-25 11:06:56 -07:00
82a57cd13f git.txt: fix typo
Fix the spelling of "internally".

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-25 10:19:30 -07:00
c4b208c309 archive: describe compression level option
Describe the only <extra> option in `git archive`, that is the compression
level option. Previously this option is only described for zip backend;
add description also for tar backend.

Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-25 10:08:23 -07:00
480f0541b8 config.txt: fix typo
Fix the spelling of "substituted".

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-25 09:12:56 -07:00
6a9a50a8af command-list.txt: remove 'sparse-index' from main help
Ever since 'git sparse-checkout' was introduced [1] it is included in
'git --help' in the section "work on the current change" along with
the commands 'add', 'mv', 'restore', and 'rm'.  It clearly doesn't
belong to that group, moreover it can't be considered such a common
command to belong to 'git --help' in the first place, so remove it
from there.

[1] 94c0956b60 (sparse-checkout: create builtin with 'list'
                subcommand, 2019-11-21)

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-25 09:10:43 -07:00
12144e8f02 http-backend: remove a duplicated code branch
Try to make reading the computation of the gzipped flag a bit more
natural.

Signed-off-by: Robin Dupret <robin.dupret@hey.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-25 08:56:01 -07:00
386076ec92 userdiff-cpp: back out the digit-separators in numbers
The implementation of digit-separating single-quotes introduced a
note-worthy regression: the change of a character literal with a
digit would splice the digit and the closing single-quote. For
example, the change from 'a' to '2' is now tokenized as
'[-a'-]{+2'+} instead of '[-a-]{+2+}'.

The options to fix the regression are:

- Tighten the regular expression such that the single-quote can only
  occur between digits (that would match the official syntax).

- Remove support for digit separators.

I chose to remove support, because

- I have not seen a lot of code make use of digit separators.

- If code does use digit separators, then the numbers are typically
  long. If a change in one of the segments occurs, it is actually
  better visible if only that segment is highlighted as the word
  that changed instead of the whole long number.

This choice does introduce another minor regression, though, which
is highlighted in the test case: when a change occurs in the second
or later segment of a hexadecimal number where the segment begins
with a digit, but also has letters, the segment is mistaken as
consisting of a number and an identifier. I can live with that.

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-25 08:47:44 -07:00
c21fb4676f submodule--helper: fix incorrect newlines in an error message
A refactoring[1] done as part of the recent conversion of
'git submodule add' to builtin, changed the error message
shown when a Git directory already exists locally for a submodule
name. Before the refactoring, the error used to appear like so:

  --- START OF OUTPUT ---
  $ git submodule add ../sub/ subm
  A git directory for 'subm' is found locally with remote(s):
    origin        /me/git-repos-for-test/sub
  If you want to reuse this local git directory instead of cloning again from
    /me/git-repos-for-test/sub
  use the '--force' option. If the local git directory is not the correct repo
  or you are unsure what this means choose another name with the '--name' option.
  ---  END OF OUTPUT  ---

After the refactoring the error started appearing like so:

  --- START OF OUTPUT ---
  $ git submodule add ../sub/ subm
  A git directory for 'subm' is found locally with remote(s):  origin     /me/git-repos-for-test/sub
  fatal: If you want to reuse this local git directory instead of cloning again from
  /me/git-repos-for-test/sub
  use the '--force' option. If the local git directory is not the correct repo
  or if you are unsure what this means, choose another name with the '--name' option.

  ---  END OF OUTPUT  ---

As one could observe the remote information is printed along with the
first line rather than on its own line. Also, there's an additional
newline following output.

Make the error message consistent with the error message that used to be
printed before the refactoring.

This also moves the 'fatal:' prefix that appears in the middle of the
error message to the first line as it would more appropriate to have
it in the first line. The output after the change would look like:

  --- START OF OUTPUT ---
  $ git submodule add ../sub/ subm
  fatal: A git directory for 'subm' is found locally with remote(s):
    origin        /me/git-repos-for-test/sub
  If you want to reuse this local git directory instead of cloning again from
    /me/git-repos-for-test/sub
  use the '--force' option. If the local git directory is not the correct repo
  or you are unsure what this means choose another name with the '--name' option.
  ---  END OF OUTPUT  ---

[1]: https://lore.kernel.org/git/20210710074801.19917-5-raykar.ath@gmail.com/#t

Signed-off-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-23 23:01:56 -07:00
8252ec300e branch (doc): -m/-c copies config and reflog
The description section for the command mentions config and reflog
are moved or copied by these options, but the description for these
options did not.  Make them match.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-23 17:12:41 -07:00
203eb8381a format-patch (doc): clarify --base=auto
What --base=auto tells format-patch is to compute the base commit
itself, using the tracking information.  It does not make anything
track anything.

Tighten the phrasing so that it won't be copied and pasted to other
places.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-23 14:33:20 -07:00
3c8150497f reflog: free() ref given to us by dwim_log()
When dwim_log() returns the "ref" is always ether NULL or an
xstrdup()'d string.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-23 10:45:25 -07:00
c270b055d9 submodule--helper: fix small memory leaks
Add a missing strbuf_release() and a clear_pathspec() to the
submodule--helper.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-23 10:45:25 -07:00
27ff1fbc5d clone: fix a memory leak of the "git_dir" variable
At this point in cmd_clone the "git_dir" is always either an
xstrdup()'d string, or something we got from mkpathdup(). Let's free()
it before we clobber it.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-23 10:45:25 -07:00
b202e51b15 grep: fix a "path_list" memory leak
Free the "path_list" used in builtin/grep.c, it was declared as
STRING_LIST_INIT_NODUP, let's change it to a STRING_LIST_INIT_DUP
since an early user in cmd_grep() appends a string passed via
parse-options.c to it, which needs to be duplicated.

Let's then convert the remaining callers to use
string_list_append_nodup() instead, allowing us to free the list.

This makes all the tests in t7811-grep-open.sh pass, 6/10 would fail
before this change. The only remaining failure would have been due to
a stray "git checkout" (which still leaks memory). In this case we can
use a "git reset --hard" instead, so let's do that, and move the
test_when_finished() above the code that would modify the relevant
file.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-23 10:45:25 -07:00
96c101257b grep: use object_array_clear() in cmd_grep()
Free the "struct object_array" before exiting. This makes grep tests
(e.g.  "t7815-grep-binary.sh") a bit happer under SANITIZE=leak.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-23 10:45:25 -07:00
a2fb7672c0 grep: prefer "struct grep_opt" over its "void *" equivalent
Stylistically fix up code added in bfac23d953 (grep: Fix two memory
leaks, 2010-01-30). We usually don't use the "arg" at all once we've
casted it to the struct we want, let's not do that here when we're
freeing it. Perhaps it was thought that a cast to "void *" would
otherwise be needed?

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-23 10:45:25 -07:00
25ad722126 config.c: don't leak memory in handle_path_include()
Fix a memory leak in the error() path in handle_path_include(), this
allows us to run t1305-config-include.sh under SANITIZE=leak,
previously 4 tests there would fail. This fixes up a leak in
9b25a0b52e (config: add include directive, 2012-02-06).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-21 16:26:45 -07:00
8a7a90bc3d Makefile: remove redundant GIT-CFLAGS dependency from "sparse"
The "sparse" target needed the GIT-CFLAGS dependency before my
c234e8a0ec (Makefile: make the "sparse" target non-.PHONY,
2021-09-23), but since then it depends on the corresponding *.o files,
which in turn depend on the correct header files, as well as on
GIT-CFLAGS. There's no need to re-state this dependency here.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-21 16:19:52 -07:00
c1e10b2dce git-sh-setup: remove messaging supporting --preserve-merges
Remove messages that were last used by the code removed in
a74b35081c (rebase: drop support for `--preserve-merges`,
2021-09-07).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-21 16:04:29 -07:00
e29099a2d1 git-sh-i18n: remove unused eval_ngettext()
The "eval_ngettext()" function has been orphaned since its last user
was removed in a74b35081c (rebase: drop support for
`--preserve-merges`, 2021-09-07).

See b8fc9e43a7 (i18n: rebase-interactive: mark here-doc strings for
translation, 2016-06-17) for the commit that added these
eval_ngettext() wrappers.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-21 16:04:29 -07:00
d72d4f92e2 branch: use ref_sorting_release()
Use a ref_sorting_release() in branch.c to free the memory from the
ref_sorting_options(). This plugs the final in-tree memory leak of
that API.

In the preceding commit the "sorting" variable was left in the
cmd_branch() scope, even though that wasn't needed anymore. Move it to
the "else if (list)" scope instead. We can also move the "struct
string_list" only used for that branch to be declared in that block

That "struct ref_sorting" does not need to be "static" (and isn't
re-used). The "ref_sorting_options()" will return a valid one, we
don't need to make it "static" to have it zero'd out. That it was
static was another artifact of the pre-image of the preceding commit.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-20 11:36:13 -07:00
e5fb028688 ref-filter API user: add and use a ref_sorting_release()
Add a ref_sorting_release() and use it for some of the current API
users, the ref_sorting_default() function and its siblings will do a
malloc() which wasn't being free'd previously.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-20 11:36:13 -07:00
37766b61cd tag: use a "goto cleanup" pattern, leak less memory
Change cmd_tag() to free its "struct strbuf"'s instead of using an
UNLEAK() pattern. This changes code added in 886e1084d7 (builtin/:
add UNLEAKs, 2017-10-01).

As shown in the context of the declaration of the "struct
msg_arg" (which I'm changing to use a designated initializer while at
it, and to show the context in this change), that struct is just a
thin wrapper around an int and "struct strbuf".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-20 11:36:13 -07:00
8464b2d1d8 git config doc: fix recent ASCIIDOC formatting regression
Fix a regression in 8c32856133 (blame: document --color-* options,
2021-10-08), which added an extra newline before the "+" syntax.

The "Documentation/doc-diff HEAD~ HEAD" output with this applied is:

    [...]
    @@ -1815,13 +1815,13 @@ CONFIGURATION FILE
                specified colors if the line was introduced before the given
                timestamp, overwriting older timestamped colors.

    -       + Instead of an absolute timestamp relative timestamps work as well,
    -       e.g. 2.weeks.ago is valid to address anything older than 2 weeks.
    +           Instead of an absolute timestamp relative timestamps work as well,
    +           e.g.  2.weeks.ago is valid to address anything older than 2 weeks.

    -       + It defaults to blue,12 month ago,white,1 month ago,red, which colors
    -       everything older than one year blue, recent changes between one month
    -       and one year old are kept white, and lines introduced within the last
    -       month are colored red.
    +           It defaults to blue,12 month ago,white,1 month ago,red, which
    +           colors everything older than one year blue, recent changes between
    +           one month and one year old are kept white, and lines introduced
    +           within the last month are colored red.

            color.blame.repeatedLines
                Use the specified color to colorize line annotations for git blame

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-20 10:55:09 -07:00
9d530dc002 The fourteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-18 15:48:10 -07:00
f217f6d1d1 Merge branch 'tz/doc-link-to-bundle-format-fix'
Doc update.

* tz/doc-link-to-bundle-format-fix:
  doc: add bundle-format to TECH_DOCS
2021-10-18 15:47:59 -07:00
ada1a17261 Merge branch 'js/windows-ci-path-fix'
The PATH used in CI job may be too wide and let incompatible dlls
to be grabbed, which can cause the build&test to fail.  Tighten it.

* js/windows-ci-path-fix:
  ci(windows): ensure that we do not pick up random executables
2021-10-18 15:47:58 -07:00
871e42eb09 Merge branch 'bs/doc-blame-color-lines'
The "--color-lines" and "--color-by-age" options of "git blame"
have been missing, which are now documented.

* bs/doc-blame-color-lines:
  blame: document --color-* options
  blame: describe default output format
2021-10-18 15:47:58 -07:00
a86ed75f32 Merge branch 'rs/make-verify-path-really-verify-again'
Recent sparse-index work broke safety against attempts to add paths
with trailing slashes to the index, which has been corrected.

* rs/make-verify-path-really-verify-again:
  read-cache: let verify_path() reject trailing dir separators again
  read-cache: add verify_path_internal()
  t3905: show failure to ignore sub-repo
2021-10-18 15:47:58 -07:00
092228ee5c Merge branch 'jk/cat-file-batch-all-wo-replace'
"git cat-file --batch" with the "--batch-all-objects" option is
supposed to iterate over all the objects found in a repository, but
it used to translate these object names using the replace mechanism,
which defeats the point of enumerating all objects in the repository.
This has been corrected.

* jk/cat-file-batch-all-wo-replace:
  cat-file: use packed_object_info() for --batch-all-objects
  cat-file: split ordered/unordered batch-all-objects callbacks
  cat-file: disable refs/replace with --batch-all-objects
  cat-file: mention --unordered along with --batch-all-objects
  t1006: clean up broken objects
2021-10-18 15:47:57 -07:00
853ec9aa9b Merge branch 'cm/save-restore-terminal'
An editor session launched during a Git operation (e.g. during 'git
commit') can leave the terminal in a funny state.  The code path
has updated to save the terminal state before, and restore it
after, it spawns an editor.

* cm/save-restore-terminal:
  editor: save and reset terminal after calling EDITOR
  terminal: teach git how to save/restore its terminal settings
2021-10-18 15:47:57 -07:00
a4b9fb6a5c Merge branch 'ab/designated-initializers-more'
Code clean-up.

* ab/designated-initializers-more:
  builtin/remote.c: add and use SHOW_INFO_INIT
  builtin/remote.c: add and use a REF_STATES_INIT
  urlmatch.[ch]: add and use URLMATCH_CONFIG_INIT
  builtin/blame.c: refactor commit_info_init() to COMMIT_INFO_INIT macro
  daemon.c: refactor hostinfo_init() to HOSTINFO_INIT macro
2021-10-18 15:47:57 -07:00
0b69bb0fb1 Merge branch 'tb/repack-write-midx'
"git repack" has been taught to generate multi-pack reachability
bitmaps.

* tb/repack-write-midx:
  test-read-midx: fix leak of bitmap_index struct
  builtin/repack.c: pass `--refs-snapshot` when writing bitmaps
  builtin/repack.c: make largest pack preferred
  builtin/repack.c: support writing a MIDX while repacking
  builtin/repack.c: extract showing progress to a variable
  builtin/repack.c: rename variables that deal with non-kept packs
  builtin/repack.c: keep track of existing packs unconditionally
  midx: preliminary support for `--refs-snapshot`
  builtin/multi-pack-index.c: support `--stdin-packs` mode
  midx: expose `write_midx_file_only()` publicly
2021-10-18 15:47:57 -07:00
223a1bfb58 Merge branch 'js/retire-preserve-merges'
The "--preserve-merges" option of "git rebase" has been removed.

* js/retire-preserve-merges:
  sequencer: restrict scope of a formerly public function
  rebase: remove a no-longer-used function
  rebase: stop mentioning the -p option in comments
  rebase: remove obsolete code comment
  rebase: drop the internal `rebase--interactive` command
  git-svn: drop support for `--preserve-merges`
  rebase: drop support for `--preserve-merges`
  pull: remove support for `--rebase=preserve`
  tests: stop testing `git rebase --preserve-merges`
  remote: warn about unhandled branch.<name>.rebase values
  t5520: do not use `pull.rebase=preserve`
2021-10-18 15:47:56 -07:00
0ef08090d2 Merge branch 'rs/mergesort'
The mergesort implementation used to sort linked list has been
optimized.

* rs/mergesort:
  test-mergesort: use repeatable random numbers
  mergesort: use ranks stack
  p0071: test performance of llist_mergesort()
  p0071: measure sorting of already sorted and reversed files
  test-mergesort: add unriffle_skewed mode
  test-mergesort: add unriffle mode
  test-mergesort: add generate subcommand
  test-mergesort: add test subcommand
  test-mergesort: add sort subcommand
  test-mergesort: use strbuf_getline()
2021-10-18 15:47:56 -07:00
c5c3486f38 transport-helper: recognize "expecting report" error from send-pack
When a transport helper pushes via send-pack, it passes --helper-status
to get a machine-readable status back for each ref. The previous commit
taught the send-pack code to hand back "error expecting report" if the
server did not send us the proper ref-status. And that's enough to cause
us to recognize that an error occurred for the ref and print something
sensible in our final status table.

But we do interpret these messages on the remote-helper side to turn
them back into REF_STATUS_* enum values.  Recognizing this token to turn
it back into REF_STATUS_EXPECTING_REPORT has two advantages:

  1. We now print exactly the same message in the human-readable (and
     machine-readable --porcelain) output for this situation whether the
     transport went through a helper (e.g., http) or not (e.g., ssh).

  2. If any code in the helper really cares about distinguishing
     EXPECT_REPORT from more generic error conditions, it could now do
     so. I didn't find any, so this is mostly future-proofing.

So this is mostly cosmetic for now, but it seems like the
least-surprising thing for the transport-helper code to be doing.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-18 13:27:36 -07:00
e4c9538a9c send-pack: complain about "expecting report" with --helper-status
When pushing to a server which erroneously omits the final ref-status
report, the client side should complain about the refs for which we
didn't receive the status (because we can't just assume they were
updated). This works over most transports like ssh, but for http we'll
print a very misleading "Everything up-to-date".

It works for ssh because send-pack internally sets the status of each
ref to REF_STATUS_EXPECTING_REPORT, and then if the server doesn't tell
us about a particular ref, it will stay at that value. When we print the
final status table, we'll see that we're still on EXPECTING_REPORT and
complain then.

But for http, we go through remote-curl, which invokes send-pack with
"--stateless-rpc --helper-status". The latter option causes send-pack to
return a machine-readable list of ref statuses to the remote helper. But
ever since its inception in de1a2fdd38 (Smart push over HTTP: client
side, 2009-10-30), the send-pack code has simply omitted mention of any
ref which ended up in EXPECTING_REPORT.

In the remote helper, we then take the absence of any status report
from send-pack to mean that the ref was not even something we tried to
send, and thus it prints "Everything up-to-date". Fortunately it does
detect the eventual non-zero exit from send-pack, and propagates that in
its own non-zero exit code. So at least a careful script invoking "git
push" would notice the failure.  But sending the misleading message on
stderr is certainly confusing for humans (not to mention the
machine-readable "push --porcelain" output, though again, any careful
script should be checking the exit code from push, too).

Nobody seems to have noticed because the server in this instance has to
be misbehaving: it has promised to support the ref-status capability
(otherwise the client will not set EXPECTING_REPORT at all), but didn't
send us any. If the connection were simply cut, then send-pack would
complain about getting EOF while trying to read the status. But if the
server actually sends a flush packet (i.e., saying "now you have all of
the ref statuses" without actually sending any), then the client ends up
in this confused situation.

The fix is simple: we should return an error message from "send-pack
--helper-status", just like we would for any other error per-ref error
condition (in the test I included, the server simply omits all ref
status responses, but a more insidious version of this would skip only
some of them).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-18 13:26:52 -07:00
f3af71c947 gpg-interface: fix leak of strbufs in get_ssh_key_fingerprint()
We read stdout from gpg into a strbuf, then split it into a list of
strbufs, pull out one element, and return it. But we don't free either
the original stdout buffer, nor the list returned from strbuf_split().

This patch fixes both. Note that we have to detach the returned string
from its strbuf before calling strbuf_list_free(), as that would
otherwise throw it away.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-18 13:16:53 -07:00
78d468f1a9 gpg-interface: fix leak of "line" in parse_ssh_output()
We xmemdupz() this buffer, but never free it. Let's do so. We'll use a
cleanup label, since there are multiple exits from the function.

Note that it was also declared a "const char *". We could switch that to
"char *" to indicate that it's allocated, but that make it awkward to
use with skip_prefix(). So instead, we'll introduce an extra non-const
pointer.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-18 13:16:51 -07:00
5e311edfd3 t1092: run "rebase --apply" without "-q" in testing
We run a few operations and make sure they produce identical results
with and without sparse-index; the version we merged to the "next"
branch used the "-q" option to work around a breakage caused by a
version used at Microsoft with some unreleased changes, but since
we would want to make sure the commands produce identical results,
including reports given to the output that lists which commits were
picked, use of "-q" loses too much interesting information.

Let's drop "-q" from the command invocation and revisit the issue
when the problematic changes are upstreamed.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-18 09:24:51 -07:00
a897ab7ed1 gc: perform incremental repack when implictly enabled
builtin/gc.c has two ways of checking if multi-pack-index is enabled:
- git_config_get_bool() in incremental_repack_auto_condition()
- the_repository->settings.core_multi_pack_index in
  maintenance_task_incremental_repack()

The two implementations have existed since the incremental-repack task
was introduced in e841a79a13 (maintenance: add incremental-repack auto
condition, 2020-09-25). These two values can diverge because
prepare_repo_settings() enables the feature in the_repository->settings
by default.

In the case where core.multiPackIndex is not set in the config, the auto
condition would fail, causing the incremental-repack task to not be
run. Because we always want to consider the default values, we should
always use the_repository->settings.

Standardize on using the_repository->settings.core_multi_pack_index to
check if multi-pack-index is enabled.

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-15 14:30:10 -07:00
dc5570872f fsck: verify multi-pack-index when implictly enabled
Like the previous commit, change fsck to check the
"core_multi_pack_index" variable set in "repo-settings.c" instead of
reading the "core.multiPackIndex" config variable. This fixes a bug
where we wouldn't verify midx if the config key was missing. This bug
was introduced in 18e449f86b (midx: enable core.multiPackIndex by
default, 2020-09-25) where core.multiPackIndex was turned on by default.

Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-15 14:30:08 -07:00
f30e4d854b fsck: verify commit graph when implicitly enabled
Change fsck to check the "core_commit_graph" variable set in
"repo-settings.c" instead of reading the "core.commitGraph" variable.
This fixes a bug where we wouldn't verify the commit-graph if the
config key was missing. This bug was introduced in
31b1de6a09 (commit-graph: turn on commit-graph by default, 2019-08-13),
where core.commitGraph was turned on by default.

Add tests to "t5318-commit-graph.sh" to verify that fsck checks the
commit-graph as expected for the 3 values of core.commitGraph. Also,
disable GIT_TEST_COMMIT_GRAPH in t/t0410-partial-clone.sh because some
test cases use fsck in ways that assume that commit-graph checking is
disabled.

Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-15 14:30:07 -07:00
ed41385ad6 Merge branch 'ab/ignore-replace-while-working-on-commit-graph' into gc/use-repo-settings
* ab/ignore-replace-while-working-on-commit-graph:
  commit-graph: don't consider "replace" objects with "verify"
  commit-graph tests: fix another graph_git_two_modes() helper
  commit-graph tests: fix error-hiding graph_git_two_modes() helper
2021-10-15 14:30:00 -07:00
ec9a37d69b pkt-line.[ch]: remove unused packet_read_line_buf()
This function was added in 4981fe750b (pkt-line: share
buffer/descriptor reading implementation, 2013-02-23), but in
01f9ec64c8 (Use packet_reader instead of packet_read_line,
2018-12-29) the code that was using it was removed.

Since it's being removed we can in turn remove the "src" and "src_len"
arguments to packet_read(), all the remaining users just passed a
NULL/NULL pair to it.

That function is only a thin wrapper for packet_read_with_status()
which still needs those arguments, but for the thin packet_read()
convenience wrapper we can do away with it for now.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-15 13:09:40 -07:00
4f1d3e3e3b pkt-line.[ch]: remove unused packet_buf_write_len()
This function was added in f1f4d8acf4 (pkt-line: add
packet_buf_write_len function, 2018-03-15) for use in
0f1dc53f45 (remote-curl: implement stateless-connect command,
2018-03-15).

In a97d00799a (remote-curl: use post_rpc() for protocol v2 also,
2019-02-21) that only user of it went away, let's remove it.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-15 13:09:05 -07:00
ae22e8415d midx.c: guard against commit_lock_file() failures
When writing a MIDX, we atomically move the new MIDX into place via
commit_lock_file(), but do not check to see if that call was successful.

Make sure that we do check in order to prevent us from incorrectly
reporting that we wrote a new MIDX if we actually encountered an error.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-15 13:08:11 -07:00
c0f1f9dec4 midx.c: lookup MIDX by object directory during repack
Apply similar treatment as in the last commit to the MIDX `repack`
operation.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-15 13:08:11 -07:00
98926e0d01 midx.c: lookup MIDX by object directory during expire
Before a new MIDX can be written, expire_midx_packs() first loads the
existing MIDX, figures out which packs can be expired, and then writes a
new MIDX based on that information.

In order to load the existing MIDX, it uses load_multi_pack_index(),
which mmaps the multi-pack-index file, but does not store the resulting
`struct multi_pack_index *` in the object store.

write_midx_internal() also needs to open the existing MIDX, and it does
so by iterating the results of get_multi_pack_index(), so that it reuses
the same pointer held by the object store. But before it can move the
new MIDX into place, it close_object_store() to munmap() the
multi-pack-index file to accommodate platforms like Windows which don't
allow overwriting files which are memory mapped.

That's where things get weird. Since expire_midx_packs has its own
*separate* memory mapped copy of the MIDX, the MIDX file is still memory
mapped! Interestingly, this doesn't seem to cause a problem in our
tests. (I believe that this has much more to do with my own lack of
familiarity with Windows than it does a lack of coverage in our tests).

In any case, we can side-step the whole issue by teaching
expire_midx_packs() to use the `struct multi_pack_index` pointer it
found via the object store instead of maintain its own copy. That way,
when write_midx_internal() calls `close_object_store()`, we know that
there are no memory mapped copies of the MIDX laying around.

A couple of other small notes about this patch:

  - As far as I can tell, passing `local == 1` to the call to
    load_multi_pack_index() was an error, since object_dir could be an
    alternate. But it doesn't matter, since even though we write
    `m->local = 1`, we never read that field back later on.

  - Setting `m = NULL` after write_midx_internal() was likely to prevent
    a double-free back from when that function took a `struct
    multi_pack_index *` that it called close_midx() on itself. We can
    rely on write_midx_internal() to call that for us now.

Finally, this enforces the same "the value of --object-dir must be the
local object store, or an alternate" rule from f57a739691 (midx: avoid
opening multiple MIDXs when writing, 2021-09-01) to the `expire`
sub-command, too.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-15 13:08:11 -07:00
504131ac26 midx.c: extract MIDX lookup by object_dir
The first thing that write_midx_internal() does is load the MIDX
corresponding to the given object directory, if one is present.

Prepare for other functions in midx.c to do the same thing by extracting
that operation out to a small helper function.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-15 13:08:11 -07:00
823b4281ca Merge branch 'tb/repack-write-midx' into tb/fix-midx-rename-while-mapped
* tb/repack-write-midx:
  test-read-midx: fix leak of bitmap_index struct
  builtin/repack.c: pass `--refs-snapshot` when writing bitmaps
  builtin/repack.c: make largest pack preferred
  builtin/repack.c: support writing a MIDX while repacking
  builtin/repack.c: extract showing progress to a variable
  builtin/repack.c: rename variables that deal with non-kept packs
  builtin/repack.c: keep track of existing packs unconditionally
  midx: preliminary support for `--refs-snapshot`
  builtin/multi-pack-index.c: support `--stdin-packs` mode
  midx: expose `write_midx_file_only()` publicly
2021-10-15 13:07:37 -07:00
ae39ba431a grep/pcre2: fix an edge case concerning ascii patterns and UTF-8 data
If we attempt to grep non-ascii log message text with an ascii pattern, we
run into the following issue:

    $ git log --color --author='.var.*Bjar' -1 origin/master | grep ^Author
    grep: (standard input): binary file matches

So, to fix this teach the grep code to use PCRE2_UTF, as long as the log
output is encoded in UTF-8.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Hamza Mahfooz <someguy@effective-light.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-15 12:45:39 -07:00
8650c6298c doc lint: make "lint-docs" non-.PHONY
Speed up the "lint-docs" target by making it non-.PHONY. Similar to my
c234e8a0ec (Makefile: make the "sparse" target non-.PHONY,
2021-09-23). We'll now create empty files corresponding to a
dependency graph for each of these lint scripts.

This speeds things up a bit[1], and makes the output correspond to any
in-tree changes we have:

    $ touch git-add.txt; make lint-docs; make lint-docs
        GEN cmd-list.made
        GEN doc.dep
        LINT GITLINK git-add.txt
        LINT MAN END git-add.txt
        LINT MAN SEC git-add.txt
    make: Nothing to be done for 'lint-docs'.

As with the "sparse" target changes this has a hard dependency on the
use of ".DELETE_ON_ERROR" in the Makefile, added here in
db10fc6c09 (doc: simplify Makefile using .DELETE_ON_ERROR,
2021-05-21). This method also depends on the output for us emitting
any errors on STDERR (fixed in a preceding commit), as well us these
scripts exiting with non-zero on any errors (which they were already
doing).

1.
$ git show HEAD~:Documentation/Makefile >Makefile.old
$ hyperfine --warmup 2 -L f ",.old" 'make -j1 -f Makefile{f} lint-docs'
Benchmark #1: make -j1 -f Makefile lint-docs
  Time (mean ± σ):      60.8 ms ±   1.4 ms    [User: 58.7 ms, System: 2.5 ms]
  Range (min … max):    58.9 ms …  64.0 ms    48 runs

Benchmark #2: make -j1 -f Makefile.old lint-docs
  Time (mean ± σ):      84.0 ms ±   1.5 ms    [User: 78.6 ms, System: 5.7 ms]
  Range (min … max):    81.8 ms …  87.8 ms    35 runs

Summary
  'make -j1 -f Makefile lint-docs' ran
    1.38 ± 0.04 times faster than 'make -j1 -f Makefile.old lint-docs'

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-15 10:29:11 -07:00
8cc804d0ab doc build: speed up "make lint-docs"
Extend the trick we use to speed up the "clean" target to also extend
to the "lint-docs" target. See 54df87555b (Documentation/Makefile:
conditionally include doc.dep, 2020-12-08) for the "clean"
implementation.

The "doc-lint" target only depends on *.txt files, so we don't need to
generate GIT-VERSION-FILE etc. if that's all we're doing. This makes
the "make lint-docs" target more than 2x as fast:

$ git show HEAD~:Documentation/Makefile >Makefile.old
$ hyperfine -L f ",.old" 'make -f Makefile{f} lint-docs'
Benchmark #1: make -f Makefile lint-docs
  Time (mean ± σ):     100.2 ms ±   1.3 ms    [User: 93.7 ms, System: 6.7 ms]
  Range (min … max):    98.4 ms … 103.1 ms    29 runs

Benchmark #2: make -f Makefile.old lint-docs
  Time (mean ± σ):     220.0 ms ±  20.0 ms    [User: 206.0 ms, System: 18.0 ms]
  Range (min … max):   206.6 ms … 267.5 ms    11 runs

Summary
  'make -f Makefile lint-docs' ran
    2.19 ± 0.20 times faster than 'make -f Makefile.old lint-docs'

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-15 10:20:21 -07:00
f005593dc3 doc lint: emit errors on STDERR
Have all of the scripts invoked by "make check-docs" emit their output
on STDERR. This does not currently matter due to the way we're
invoking them, but will in a subsequent change. It's a good idea to do
this in any case for consistency with other tools we invoke.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-15 10:16:57 -07:00
7e19e2efa9 doc lint: fix error-hiding regression
Fix the broken "make lint-docs" (or "make check-docs" at the
top-level) target, which has been broken since my cafd9828e8 (doc
lint: lint and fix missing "GIT" end sections, 2021-04-09).

The CI for "seen" is emitting an error about a broken gitlink, but due
to there being 3x scripts chained via ";" instead of "&&" we're not
carrying forward the non-zero exit code.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-15 10:16:36 -07:00
1c720357ce tests: stop using top-level "README" and "COPYING" files
In 459b8d22e5 (tests: do not borrow from COPYING and README from the
real source, 2015-02-15) tests that used "lib-diff.sh" (called
"diff-lib.sh" then) were made to stop relying on the top-level COPYING
file, but we still had other tests that referenced it.

Let's move them over to use the "COPYING_test_data" utility function
introduced in the preceding commit, and in the case of the one test
that needed the "README" file use a ROT 13 version of that "COPYING"
test data. That test added in afd222967c (Extend testing git-mv for
renaming of subdirectories, 2006-07-26) just needs more test data that's not the same as the "COPYING" test data, so a ROT 13 version will do.

This change removes the last references to ../{README,COPYING} in the
test suite.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-15 09:36:48 -07:00
15b808da74 "lib-diff" tests: make "README" and "COPYING" test data smaller
Follow-up the change in 459b8d22e5 (tests: do not borrow from COPYING
and README from the real source, 2015-02-15) by not shipping a full
copy of older versions of the top-level "COPYING" and "README" files.

The tests that use them just need the small blurb at the top of
"COPYING" as test data, or mock data that's dissimilar. Let's provide
that with a "COPYING_test_data" function instead.

We're not replacing this with some other generic test
data (e.g. "lorum ipsum") because these tests require test file header
to be the old "COPYING" file. See e.g. "t4003-diff-rename-1.sh" which
changes the file, and then does full "test_cmp" comparisons on the
resulting "git diff" output.

This change only changes tests that used the "lib-diff.sh" library,
but splits up what they need into a new "lib-diff-data.sh". A
subsequent commit will change related tests that were missed in
459b8d22e5.

For the test in "t4008-diff-break-rewrite.sh" the "README" file can go
away in favor of echoing the line "some dissimilar content" to a file
in the one test that needed it.

The point of that test is to start with files "A" and "B", and then
have A be more similar to the state of "B" than to its old version (by
copying over the content from the "COPYING" file). Just comparing the
pre-image of "some dissimilar content" and later a munged version of
the "COPYING" output serves that purpose.

While we're at it get rid of a stray "echo $tree" debugging line added
in 15d061b435 ([PATCH] Fix the way diffcore-rename records unremoved
source., 2005-05-27), and stop calling "hash-object" to get the hash
of an object we've just added to the index. We can instead extract
that information from the index itself with "rev-parse".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-15 09:36:46 -07:00
095d112f8c commit-graph: don't consider "replace" objects with "verify"
Extend the code added in d6538246d3 (commit-graph: not compatible
with replace objects, 2018-08-20) which ignored replace objects in the
"write" command to ignore it in the "verify" command too.

We can just move this assignment to the cmd_commit_graph(), it
dispatches to "write" and "verify", and we're unlikely to ever get a
sub-command that would like to consider replace refs.

This will make tests added in eddc1f556c (mktag tests: test
update-ref and reachable fsck, 2021-06-17) pass in combination with
the "GIT_TEST_COMMIT_GRAPH" mode added in 859fdc0c3c (commit-graph:
define GIT_TEST_COMMIT_GRAPH, 2018-08-29), except that mode is
currently broken (but is being fixed concurrently). See the discussion
starting at [1].

1. https://lore.kernel.org/git/87wnmihswp.fsf@evledraar.gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-15 09:21:30 -07:00
a046aa38ca commit-graph tests: fix another graph_git_two_modes() helper
In 135a712375 (commit-graph: add --split option to builtin,
2019-06-18) this function was copy/pasted to the split commit-graph
tests, as in the preceding commit we need to fix this to use
&&-chaining, so it won't be hiding errors.

Unlike its sister function in "t5318-commit-graph.sh", which we got
lucky with, this one was hiding a real test failure. A tests added in
c523035cbd (commit-graph: allow cross-alternate chains, 2019-06-18)
has never worked as intended. Unlike most other graph_git_behavior
uses in this file it clones the repository into a sub-directory, so
we'll need to refer to "commits/6" as "origin/commits/6".

It's not easy to simply move the "graph_git_behavior" to the test
above it, since it itself spawns a "test_expect_success". Let's
instead add support to "graph_git_behavior()" and
"graph_git_two_modes()" to pass a "-C" argument to git.

We also need to add a "test -d fork" here, because otherwise we'll
fail on e.g.:

    GIT_SKIP_TESTS=t5324.13 ./t5324-split-commit-graph.sh

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-15 09:21:30 -07:00
3247919a75 commit-graph tests: fix error-hiding graph_git_two_modes() helper
The graph_git_two_modes() helper added in 177722b344 (commit:
integrate commit graph with commit parsing, 2018-04-10) didn't
&&-chain its "git commit-graph" invocations, which as can be seen with
SANITIZE=leak will happily mark tests as passing if both of these
commands die, since test_cmp() will be comparing two empty files.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-15 09:21:30 -07:00
5d22e18965 test-lib.sh: try to re-chmod & retry on failed trash removal
Try to re-chmod the trash directory on startup if we fail to "rm -rf"
it. This fixes problems where the test leaves the trash directory
behind in a bad permission state for whatever reason.

This fixes an interaction between [1] where t0004-unwritable.sh was
made to use "test_when_finished" for cleanup, and [2] which added the
"--immediate" mode. If a test in this file failed when running with
"--immediate" we wouldn't run the "test_when_finished" block, which
re-chmods the ".git/objects" directory (see [1]).

This can be demonstrated as e.g. (output snipped for less verbosity):

    $ ./t0004-unwritable.sh --run=3 --immediate
    ok 1 # skip setup (--run)
    ok 2 # skip write-tree should notice unwritable repository (--run)
    not ok 3 - commit should notice unwritable repository
    [...]
    $ ./t0004-unwritable.sh --run=3 --immediate
    rm: cannot remove '[...]/trash directory.t0004-unwritable/.git/objects/info': Permission denied
    FATAL: Cannot prepare test area
    [...]

Instead of some version of reverting [1] let's make the test-lib.sh
resilient to this edge-case, it will happen due to [1], but also
e.g. if the relevant "test-lib.sh" process is kill -9'd during the
test run. We should try harder to recover in this case. If we fail to
remove the test directory let's retry after (re-)chmod-ing it.

This doesn't need to be guarded by something that's equivalent to
"POSIXPERM" since if we don't support "chmod" we were about to fail
anyway.

Let's also discard any error output from (a possibly nonexisting)
"chmod", we'll fail on the subsequent "rm -rf" anyway, likewise for
the first "rm -rf" invocation, we don't want to get the "cannot
remove" output if we can get around it with the "chmod", but we do
want any error output from the second "rm -rf", in case that doesn't
fix the issue.

The lack of &&-chaining between the "chmod" and "rm -rf" is
intentional, if we fail the first "rm -rf", can't chmod, but then
succeed the second time around that's what we were hoping for. We just
want to nuke the directory, not carry forward every possible error
code or error message.

1. dbda967684 (t0004 (unwritable files): simplify error handling,
   2010-09-06)
2. b586744a86 (test: skip clean-up when running under --immediate
   mode, 2011-06-27)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-15 08:54:33 -07:00
f443b226ca Thirteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-14 09:55:16 -07:00
234383cd40 test-lib.sh: use "Bail out!" syntax on bad SANITIZE=leak use
Improve the "GIT_TEST_PASSING_SANITIZE_LEAK=true" test mode added in
956d2e4639 (tests: add a test mode for SANITIZE=leak, run it in CI,
2021-09-23) to use a TAP "Bail out!" message when exiting. This will
cause the test run to exit immediately under a TAP consumer like
"prove(1)".

See 614fe01521 (test-lib: bail out when "-v" used under "prove",
2016-10-22) for the initial introduction of "Bail out!" to the
--verbose being amended here.

Before this compiling with "SANITIZE=" and running the tests with
"prove(1)" would cause all the tests to be run to the end (output
trimmed for fewer columns):

    $ GIT_TEST_PASSING_SANITIZE_LEAK=true make
    rm -f -r 'test-results'
    *** prove ***
    t0000-basic.sh ......... Dubious, test returned 1 (wstat 256, 0x100)
    No subtests run
    t0001-init.sh .......... Dubious, test returned 1 (wstat 256, 0x100)
    No subtests run
    [...output where we list every single t[0-9]*.sh file as failing snipped]

Whereas now we'll fail early, like this ("->" line wrapping added):

    $ GIT_TEST_PASSING_SANITIZE_LEAK=true make
    [...]

    t0000-basic.sh ..................................... Bailout called.  Further testing stopped:
    -> GIT_TEST_PASSING_SANITIZE_LEAK=true has no effect except when compiled with SANITIZE=leak
    FAILED--Further testing stopped: GIT_TEST_PASSING_SANITIZE_LEAK=true has no effect except
    -> when compiled with SANITIZE=leak
    make: *** [Makefile:53: prove] Error 1

This change also adds a red color to the "Bailout called" line, as
we're now using "say_color error". That improves existing output in
the case of e.g.:

    $ prove -j8 t[0-9]*.sh :: -v
    Bailout called.  Further testing stopped:  verbose mode forbidden under TAP harness; try --verbose-log
    FAILED--Further testing stopped: verbose mode forbidden under TAP harness; try --verbose-log

We don't need to have a "Bail out! " prefix when we're not running
under a TAP consumer (i.e. if test -n "$HARNESS_ACTIVE"), but let's
not make the output conditional on that. Showing it under e.g.:

    $ GIT_TEST_PASSING_SANITIZE_LEAK=true ./t0095-bloom.sh
    Bail out! GIT_TEST_PASSING_SANITIZE_LEAK=true has no effect except when compiled with SANITIZE=leak

Doesn't harm anything, and I don't think the (small) complexity of
only adding this if we're under "$HARNESS_ACTIVE" is worth it.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-14 09:54:54 -07:00
8583bf7559 test-lib.sh: de-duplicate error() teardown code
De-duplicate the "finalize_junit_xml; GIT_EXIT_OK=t; exit 1" code
shared between the "error()" and "--immediate on failure" code paths,
in preparation for adding a third user in a subsequent commit.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-14 09:54:54 -07:00
9875c51553 Merge branch 'ja/doc-status-types-and-copies'
A few kinds of changes "git status" can show were not documented.

* ja/doc-status-types-and-copies:
  Documentation/git-status: mention how to detect copies
  Documentation/git-status: document porcelain status T (typechange)
  Documentation/diff-format: state in which cases porcelain status is T
  Documentation/git-status: remove impossible porcelain status DR and DC
2021-10-13 15:15:58 -07:00
f0beebdb7b Merge branch 'ab/make-sparse-for-real'
Prevent "make sparse" from running for the source files that
haven't been modified.

* ab/make-sparse-for-real:
  Makefile: make the "sparse" target non-.PHONY
2021-10-13 15:15:58 -07:00
d7bc852151 Merge branch 'ab/align-parse-options-help'
When "git cmd -h" shows more than one line of usage text (e.g.
the cmd subcommand may take sub-sub-command), parse-options API
learned to align these lines, even across i18n/l10n.

* ab/align-parse-options-help:
  parse-options: properly align continued usage output
  git rev-parse --parseopt tests: add more usagestr tests
  send-pack: properly use parse_options() API for usage string
  parse-options API users: align usage output in C-strings
2021-10-13 15:15:58 -07:00
62f035aee3 Merge branch 'ab/help-config-vars'
Teach "git help -c" into helping the command line completion of
configuration variables.

* ab/help-config-vars:
  help: move column config discovery to help.c library
  help / completion: make "git help" do the hard work
  help tests: test --config-for-completion option & output
  help: simplify by moving to OPT_CMDMODE()
  help: correct logic error in combining --all and --guides
  help: correct logic error in combining --all and --config
  help tests: add test for --config output
  help: correct usage & behavior of "git help --guides"
  help: correct the usage string in -h and documentation
2021-10-13 15:15:58 -07:00
af303ee392 Merge branch 'jh/builtin-fsmonitor-part1'
Built-in fsmonitor (part 1).

* jh/builtin-fsmonitor-part1:
  t/helper/simple-ipc: convert test-simple-ipc to use start_bg_command
  run-command: create start_bg_command
  simple-ipc/ipc-win32: add Windows ACL to named pipe
  simple-ipc/ipc-win32: add trace2 debugging
  simple-ipc: move definition of ipc_active_state outside of ifdef
  simple-ipc: preparations for supporting binary messages.
  trace2: add trace2_child_ready() to report on background children
2021-10-13 15:15:58 -07:00
a5e61a4225 Merge branch 'ab/config-based-hooks-1'
Mostly preliminary clean-up in the hook API.

* ab/config-based-hooks-1:
  hook-list.h: add a generated list of hooks, like config-list.h
  hook.c users: use "hook_exists()" instead of "find_hook()"
  hook.c: add a hook_exists() wrapper and use it in bugreport.c
  hook.[ch]: move find_hook() from run-command.c to hook.c
  Makefile: remove an out-of-date comment
  Makefile: don't perform "mv $@+ $@" dance for $(GENERATED_H)
  Makefile: stop hardcoding {command,config}-list.h
  Makefile: mark "check" target as .PHONY
2021-10-13 15:15:57 -07:00
cf006037bf Merge branch 'ab/lib-subtest'
Updates to the tests in t0000 to test the test framework.

* ab/lib-subtest:
  test-lib tests: get rid of copy/pasted mock test code
  test-lib tests: assert 1 exit code, not non-zero
  test-lib tests: refactor common part of check_sub_test_lib_test*()
  test-lib tests: avoid subshell for "test_cmp" for readability
  test-lib tests: don't provide a description for the sub-tests
  test-lib tests: split up "write and run" into two functions
  test-lib tests: move "run_sub_test" to a new lib-subtest.sh
2021-10-13 15:15:57 -07:00
a7c2daa06d Merge branch 'en/removing-untracked-fixes'
Various fixes in code paths that move untracked files away to make room.

* en/removing-untracked-fixes:
  Documentation: call out commands that nuke untracked files/directories
  Comment important codepaths regarding nuking untracked files/dirs
  unpack-trees: avoid nuking untracked dir in way of locally deleted file
  unpack-trees: avoid nuking untracked dir in way of unmerged file
  Change unpack_trees' 'reset' flag into an enum
  Remove ignored files by default when they are in the way
  unpack-trees: make dir an internal-only struct
  unpack-trees: introduce preserve_ignored to unpack_trees_options
  read-tree, merge-recursive: overwrite ignored files by default
  checkout, read-tree: fix leak of unpack_trees_options.dir
  t2500: add various tests for nuking untracked files
2021-10-13 15:15:57 -07:00
1fdfb774aa Merge branch 'mt/grep-submodule-textconv'
"git grep --recurse-submodules" takes trees and blobs from the
submodule repository, but the textconv settings when processing a
blob from the submodule is not taken from the submodule repository.
A test is added to demonstrate the issue, without fixing it.

* mt/grep-submodule-textconv:
  grep: demonstrate bug with textconv attributes and submodules
2021-10-13 15:15:56 -07:00
2d498a7c89 Merge branch 'ds/add-rm-with-sparse-index'
"git add", "git mv", and "git rm" have been adjusted to avoid
updating paths outside of the sparse-checkout definition unless
the user specifies a "--sparse" option.

* ds/add-rm-with-sparse-index:
  advice: update message to suggest '--sparse'
  mv: refuse to move sparse paths
  rm: skip sparse paths with missing SKIP_WORKTREE
  rm: add --sparse option
  add: update --renormalize to skip sparse paths
  add: update --chmod to skip sparse paths
  add: implement the --sparse option
  add: skip tracked paths outside sparse-checkout cone
  add: fail when adding an untracked sparse file
  dir: fix pattern matching on dirs
  dir: select directories correctly
  t1092: behavior for adding sparse files
  t3705: test that 'sparse_entry' is unstaged
2021-10-13 15:15:56 -07:00
6e4fd8bfcd doc: add bundle-format to TECH_DOCS
A link to the bundle-format was added in 5c8273d57c (bundle doc: rewrite
the "DESCRIPTION" section, 2021-07-31).

Ensure `technical/bundle-format.html` is created to avoid a broken link
in `git-bundle.html`.

Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-13 11:05:04 -07:00
571f4348dd mergetools/xxdiff: prevent segfaults from stopping difftool
Users often use "git difftool HEAD^" to review their work, and have
"mergetool.prompt" set to false so that difftool does not prompt them
before diffing each file.

This is very convenient because users can see all their diffs by
reviewing the xxdiff windows one at a time.

A problem occurs when xxdiff encounters some binary files.
It can segfault and return exit code 128, which is special-cased
by git-difftool-helper as being an extraordinary situation that
aborts the process.

Suppress the exit code from xxdiff in its diff_cmd() implementation
when we see exit code 128 so that the GIT_EXTERNAL_DIFF loop continues
on uninterrupted to the next file rather than aborting when it
encounters the first binary file.

Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-13 11:04:04 -07:00
6e658547d3 sequencer: fix a memory leak in do_reset()
Fix a memory leak introduced in 9055e401dd (sequencer: introduce new
commands to reset the revision, 2018-04-25), which called
setup_unpack_trees_porcelain() without a corresponding call to
clear_unpack_trees_porcelain().

This introduces a change in behavior in that we now start calling
clear_unpack_trees_porcelain() even without having called the
setup_unpack_trees_porcelain(). That's OK, that clear function, like
most others, will accept a zero'd out struct.

This inches us closer to passing various tests in
"t34*.sh" (e.g. "t3434-rebase-i18n.sh"), but because they have so many
other memory leaks in revisions.c this doesn't make any test file or
even a single test pass.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-13 10:37:11 -07:00
0c52cf8e00 sequencer: add a "goto cleanup" to do_reset()
Restructure code that's mostly added in 9055e401dd (sequencer:
introduce new commands to reset the revision, 2018-04-25) to avoid
code duplication, and to make freeing other resources easier in a
subsequent commit.

It's safe to initialize "tree_desc" to be zero'd out in order to
unconditionally free desc.buffer, it won't be initialized on the first
couple of "goto"'s.

There are three earlier "return"'s in this function which should
probably be made to use this new "cleanup" too, per [1] it looks like
they're leaving behind stale locks. But let's not try to fix every
potential bug here now, I'm just trying to narrowly plug a memory
leak.

1. https://lore.kernel.org/git/CABPp-BH=3DP-dXRCphY53-3eZd1TU8h5GY_M12nnbEGm-UYB9Q@mail.gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-13 10:37:11 -07:00
7491ef6198 ci(windows): ensure that we do not pick up random executables
On the Windows build agents, a lot of programs are installed, and added
to the PATH automatically.

One such program is Git for Windows, and due to the way it is set up,
unfortunately its copy of `gpg.exe` is also reachable via the PATH.

This usually does not pose any problems. To the contrary, it even allows
us to test the GPG parts of Git's test suite even if `gpg.exe` is not
delivered as part of `git-sdk-64-minimal`, the minimal subset of Git for
Windows' SDK that we use in the CI builds to compile Git.

However, every once in a while we build a new MSYS2 runtime, which means
that there is a mismatch between the copy in `git-sdk-64-minimal` and
the copy in C:\Program Files\Git\usr\bin. When that happens we hit the
dreaded problem where only one `msys-2.0.dll` is expected to be in the
PATH, and things start to fail.

Let's avoid all of this by restricting the PATH to the minimal set. This
is actually done by `git-sdk-64-minimal`'s `/etc/profile`, and we just
have to source this file manually (one would expect that it is sourced
automatically, but the Bash steps in Azure Pipelines/GitHub workflows
are explicitly run using `--noprofile`, hence the need for doing this
explicitly).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-13 10:36:17 -07:00
9fb391bff9 ssh signing: clarify trustlevel usage in docs
facca53ac added verification for ssh signatures but incorrectly
described the usage of gpg.minTrustLevel. While the verifications
trustlevel is stil set to fully or undefined depending on if the key is
known or not it has no effect on the verification result. Unknown keys
will always fail verification. This commit updates the docs to match
this behaviour.

Signed-off-by: Fabian Stelzer <fs@gigacodes.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-13 10:02:27 -07:00
f6c013dfa1 signature-format.txt: explain and illustrate multi-line headers
A signature attached to a signed commit, and the contents of the
commit that merged a signed tag, are both recorded as a value of an
object header field as a multi-line value, and are subject to the
formatting convention for multi-line values in the headers, with a
leading SP signaling that the rest of the line is a continuation of
the previous line.  Most notably, an empty line in such a multi-line
value would result in a line with a sole SP on it.

Examples in the signature-format technical documentation include a
few of these cases but we did not show these otherwise invisible SPs
in the example.  These trailing spaces cannot be seen on display or
on paper, and forces the readers to look for them in their editors
or pagers, even if we added them to the document.

Extend the overview section to explain the multi-line value
formatting and highlight these otherwise invisible SPs by inventing
the "a dollar-sign at the end of line that appears after SP merely
signals that there is a SP there, and the dollar-sign itself does
not appear in the real file" notation, inspired by "cat -e" output,
to help readers to learn exactly where such "a single SP that is
originally an empty line" appears in the examples.

Reported-by: Rob Browning <rlb@defaultvalue.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-12 19:06:24 -07:00
7ff24785cb leak tests: mark some misc tests as passing with SANITIZE=leak
Mark some tests that match "*{mktree,commit,diff,grep,rm,merge,hunk}*"
as passing when git is compiled with SANITIZE=leak. They'll now be
listed as running under the "GIT_TEST_PASSING_SANITIZE_LEAK=true" test
mode (the "linux-leaks" CI target).

These were picked because we still have a lot of failures in adjacent
areas, and we didn't have much if any coverage of e.g. grep and diff
before this change, we could still whitelist a lot more tests, but
let's stop for now.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-12 18:23:24 -07:00
288a480621 leak tests: mark various "generic" tests as passing with SANITIZE=leak
Mark various "generic" tests as passing when git is compiled with
SANITIZE=leak. These tests were subjectively picked from the lists of
passing tests since they're all small, and test some generic feature
such as wildmatch(), commonly used environment variables, ident
parsing etc.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-12 18:23:24 -07:00
98300c8157 leak tests: mark some read-tree tests as passing with SANITIZE=leak
Mark some tests that match "*read-tree*" as passing when git is
compiled with SANITIZE=leak. They'll now be listed as running under
the "GIT_TEST_PASSING_SANITIZE_LEAK=true" test mode (the "linux-leaks"
CI target). We still have around half the tests that match
"*read-tree*" failing, but let's whitelist those that don't.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-12 18:23:24 -07:00
0b3481c9ab leak tests: mark some ls-files tests as passing with SANITIZE=leak
Mark some tests that match "*ls-files*" as passing when git is
compiled with SANITIZE=leak. They'll now be listed as running under
the "GIT_TEST_PASSING_SANITIZE_LEAK=true" test mode (the "linux-leaks"
CI target). We still have others that match '*ls-files*" that fail
under SANITIZE=leak.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-12 18:23:24 -07:00
b7bcdbdb48 leak tests: mark all checkout-index tests as passing with SANITIZE=leak
Mark some tests that match "*{checkout,switch}*" as passing when git
is compiled with SANITIZE=leak. They'll now be listed as running under
the "GIT_TEST_PASSING_SANITIZE_LEAK=true" test mode (the "linux-leaks"
CI target).

Unfortunately almost all of those tests fail when compiled with
SANITIZE=leak, these only pass because they run "checkout-index", not
the main "checkout" command.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-12 18:23:24 -07:00
0033ab3f59 leak tests: mark all trace2 tests as passing with SANITIZE=leak
Mark all tests that match "*trace2*" as passing when git is compiled
with SANITIZE=leak. They'll now be listed as running under the
"GIT_TEST_PASSING_SANITIZE_LEAK=true" test mode (the "linux-leaks" CI
target).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-12 18:23:24 -07:00
809aeedb0e leak tests: mark all ls-tree tests as passing with SANITIZE=leak
Mark those tests that match "*ls-tree*" as passing when git is
compiled with SANITIZE=leak. They'll now be listed as running under
the "GIT_TEST_PASSING_SANITIZE_LEAK=true" test mode (the "linux-leaks"
CI target).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-12 18:23:24 -07:00
fdc8f79f1f leak tests: run various "test-tool" tests in t00*.sh SANITIZE=leak
Mark various existing tests in t00*.sh that invoke a "test-tool" with
as passing when git is compiled with SANITIZE=leak.

They'll now be listed as running under the
"GIT_TEST_PASSING_SANITIZE_LEAK=true" test mode (the "linux-leaks" CI
target).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-12 18:23:24 -07:00
c150064dbe leak tests: run various built-in tests in t00*.sh SANITIZE=leak
Mark various existing tests in t00*.sh that invoke git built-ins with
TEST_PASSES_SANITIZE_LEAK=true as passing when git is compiled with
SANITIZE=leak.

They'll now be listed as running under the
"GIT_TEST_PASSING_SANITIZE_LEAK=true" test mode (the "linux-leaks" CI
target).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-12 18:23:24 -07:00
2bd2f258f4 Sync with Git 2.33.1 2021-10-12 13:59:32 -07:00
af6d1d602a Git 2.33.1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-12 13:51:59 -07:00
9e25a2e85a Merge branch 'ah/connect-parse-feature-v0-fix' into maint
Protocol v0 clients can get stuck parsing a malformed feature line.

* ah/connect-parse-feature-v0-fix:
  connect: also update offset for features without values
2021-10-12 13:51:49 -07:00
bbfc8212e1 Merge branch 'ab/make-clean-depend-dirs' into maint
"make clean" has been updated to remove leftover .depend/
directories, even when it is not told to use them to compute header
dependencies.

* ab/make-clean-depend-dirs:
  Makefile: clean .depend dirs under COMPUTE_HEADER_DEPENDENCIES != yes
2021-10-12 13:51:49 -07:00
77edbde474 Merge branch 'jk/http-redact-fix' into maint
Sensitive data in the HTTP trace were supposed to be redacted, but
we failed to do so in HTTP/2 requests.

* jk/http-redact-fix:
  http: match headers case-insensitively when redacting
2021-10-12 13:51:48 -07:00
ef09a7fbbe Merge branch 'da/difftool-dir-diff-symlink-fix' into maint
"git difftool --dir-diff" mishandled symbolic links.

* da/difftool-dir-diff-symlink-fix:
  difftool: fix symlink-file writing in dir-diff mode
2021-10-12 13:51:48 -07:00
b59c06092f Merge branch 'cb/cvsserver' into maint
"git cvsserver" had a long-standing bug in its authentication code,
which has finally been corrected (it is unclear and is a separate
question if anybody is seriously using it, though).

* cb/cvsserver:
  Documentation: cleanup git-cvsserver
  git-cvsserver: protect against NULL in crypt(3)
  git-cvsserver: use crypt correctly to compare password hashes
2021-10-12 13:51:48 -07:00
c365967f21 Merge branch 'jk/clone-unborn-head-in-bare' into maint
"git clone" from a repository whose HEAD is unborn into a bare
repository didn't follow the branch name the other side used, which
is corrected.

* jk/clone-unborn-head-in-bare:
  clone: handle unborn branch in bare repos
2021-10-12 13:51:47 -07:00
e61304f21d Merge branch 'en/stash-df-fix' into maint
"git stash", where the tentative change involves changing a
directory to a file (or vice versa), was confused, which has been
corrected.

* en/stash-df-fix:
  stash: restore untracked files AFTER restoring tracked files
  stash: avoid feeding directories to update-index
  t3903: document a pair of directory/file bugs
2021-10-12 13:51:47 -07:00
9cfc01e560 Merge branch 'jk/strvec-typefix' into maint
Correct nr and alloc members of strvec struct to be of type size_t.

* jk/strvec-typefix:
  strvec: use size_t to store nr and alloc
2021-10-12 13:51:47 -07:00
b809c3d900 Merge branch 'en/am-abort-fix' into maint
When "git am --abort" fails to abort correctly, it still exited
with exit status of 0, which has been corrected.

* en/am-abort-fix:
  am: fix incorrect exit status on am fail to abort
  t4151: add a few am --abort tests
  git-am.txt: clarify --abort behavior
2021-10-12 13:51:46 -07:00
b5f309dc7f Merge branch 'ps/update-ref-batch-flush' into maint
"git update-ref --stdin" failed to flush its output as needed,
which potentially led the conversation to a deadlock.

* ps/update-ref-batch-flush:
  t1400: avoid SIGPIPE race condition on fifo
  update-ref: fix streaming of status updates
2021-10-12 13:51:46 -07:00
d64b99d35b Merge branch 'rs/no-mode-to-open-when-appending' into maint
The "mode" word is useless in a call to open(2) that does not
create a new file.  Such a call in the files backend of the ref
subsystem has been cleaned up.

* rs/no-mode-to-open-when-appending:
  refs/files-backend: remove unused open mode parameter
2021-10-12 13:51:46 -07:00
6287460203 Merge branch 'tb/pack-finalize-ordering' into maint
The order in which various files that make up a single (conceptual)
packfile has been reevaluated and straightened up.  This matters in
correctness, as an incomplete set of files must not be shown to a
running Git.

* tb/pack-finalize-ordering:
  pack-objects: rename .idx files into place after .bitmap files
  pack-write: split up finish_tmp_packfile() function
  builtin/index-pack.c: move `.idx` files into place last
  index-pack: refactor renaming in final()
  builtin/repack.c: move `.idx` files into place last
  pack-write.c: rename `.idx` files after `*.rev`
  pack-write: refactor renaming in finish_tmp_packfile()
  bulk-checkin.c: store checksum directly
  pack.h: line-wrap the definition of finish_tmp_packfile()
2021-10-12 13:51:46 -07:00
6aa501aab2 Merge branch 'rs/range-diff-avoid-segfault-with-I' into maint
"git range-diff -I... <range> <range>" segfaulted, which has been
corrected.

* rs/range-diff-avoid-segfault-with-I:
  range-diff: avoid segfault with -I
2021-10-12 13:51:45 -07:00
79c887d29d Merge branch 'ab/reverse-midx-optim' into maint
The code that optionally creates the *.rev reverse index file has
been optimized to avoid needless computation when it is not writing
the file out.

* ab/reverse-midx-optim:
  pack-write: skip *.rev work when not writing *.rev
2021-10-12 13:51:45 -07:00
62a7648a5e Merge branch 'jc/trivial-threeway-binary-merge' into maint
The "git apply -3" code path learned not to bother the lower level
merge machinery when the three-way merge can be trivially resolved
without the content level merge.

* jc/trivial-threeway-binary-merge:
  apply: resolve trivial merge without hitting ll-merge with "--3way"
2021-10-12 13:51:45 -07:00
1af632444b Merge branch 'ab/send-email-config-fix' into maint
Regression fix.

* ab/send-email-config-fix:
  send-email: fix a "first config key wins" regression in v2.33.0
2021-10-12 13:51:44 -07:00
bf4ca3fdd2 Merge branch 'so/diff-index-regression-fix' into maint
Recent "diff -m" changes broke "gitk", which has been corrected.

* so/diff-index-regression-fix:
  diff-index: restore -c/--cc options handling
2021-10-12 13:51:44 -07:00
d9e2677559 Merge branch 'jk/log-warn-on-bogus-encoding' into maint
Doc update plus improved error reporting.

* jk/log-warn-on-bogus-encoding:
  docs: use "character encoding" to refer to commit-object encoding
  logmsg_reencode(): warn when iconv() fails
2021-10-12 13:51:43 -07:00
48939c572c Merge branch 'tk/fast-export-anonymized-tag-fix' into maint
The output from "git fast-export", when its anonymization feature
is in use, showed an annotated tag incorrectly.

* tk/fast-export-anonymized-tag-fix:
  fast-export: fix anonymized tag using original length
2021-10-12 13:51:43 -07:00
f11b01bdd5 Merge branch 'mh/send-email-reset-in-reply-to' into maint
Even when running "git send-email" without its own threaded
discussion support, a threading related header in one message is
carried over to the subsequent message to result in an unwanted
threading, which has been corrected.

* mh/send-email-reset-in-reply-to:
  send-email: avoid incorrect header propagation
2021-10-12 13:51:43 -07:00
e4bea4a67d Merge branch 'sg/set-ceiling-during-tests' into maint
Buggy tests could damage repositories outside the throw-away test
area we created.  We now by default export GIT_CEILING_DIRECTORIES
to limit the damage from such a stray test.

* sg/set-ceiling-during-tests:
  test-lib: set GIT_CEILING_DIRECTORIES to protect the surrounding repository
2021-10-12 13:51:42 -07:00
a45b824097 Merge branch 'jh/sparse-index-resize-fix' into maint
The sparse-index support can corrupt the index structure by storing
a stale and/or uninitialized data, which has been corrected.

* jh/sparse-index-resize-fix:
  sparse-index: copy dir_hash in ensure_full_index()
2021-10-12 13:51:42 -07:00
0a5af02acb Merge branch 'ka/want-ref-in-namespace' into maint
"git upload-pack" which runs on the other side of "git fetch"
forgot to take the ref namespaces into account when handling
want-ref requests.

* ka/want-ref-in-namespace:
  docs: clarify the interaction of transfer.hideRefs and namespaces
  upload-pack.c: treat want-ref relative to namespace
  t5730: introduce fetch command helper
2021-10-12 13:51:42 -07:00
69247e283c Merge branch 'sg/column-nl' into maint
The parser for the "--nl" option of "git column" has been
corrected.

* sg/column-nl:
  column: fix parsing of the '--nl' option
2021-10-12 13:51:41 -07:00
689cfaf9e7 Merge branch 'cb/makefile-apple-clang' into maint
Build update for Apple clang.

* cb/makefile-apple-clang:
  build: catch clang that identifies itself as "$VENDOR clang"
  build: clang version may not be followed by extra words
  build: update detect-compiler for newer Xcode version
2021-10-12 13:51:41 -07:00
474e4f9b55 Merge branch 'rs/branch-allow-deleting-dangling' into maint
"git branch -D <branch>" used to refuse to remove a broken branch
ref that points at a missing commit, which has been corrected.

* rs/branch-allow-deleting-dangling:
  branch: allow deleting dangling branches with --force
2021-10-12 13:51:41 -07:00
cd9a57f6a0 Merge branch 'mt/quiet-with-delayed-checkout' into maint
The delayed checkout code path in "git checkout" etc. were chatty
even when --quiet and/or --no-progress options were given.

* mt/quiet-with-delayed-checkout:
  checkout: make delayed checkout respect --quiet and --no-progress
2021-10-12 13:51:40 -07:00
872c9e67ec Merge branch 'dd/diff-files-unmerged-fix' into maint
"git diff --relative" segfaulted and/or produced incorrect result
when there are unmerged paths.

* dd/diff-files-unmerged-fix:
  diff-lib: ignore paths that are outside $cwd if --relative asked
2021-10-12 13:51:40 -07:00
ae9e6ef35e Merge branch 'rs/git-mmap-uses-malloc' into maint
mmap() imitation used to call xmalloc() that dies upon malloc()
failure, which has been corrected to just return an error to the
caller to be handled.

* rs/git-mmap-uses-malloc:
  compat: let git_mmap use malloc(3) directly
2021-10-12 13:51:40 -07:00
3fb1d4da5e Merge branch 'pw/rebase-r-fixes' into maint
Various bugs in "git rebase -r" have been fixed.

* pw/rebase-r-fixes:
  rebase -r: fix merge -c with a merge strategy
  rebase -r: don't write .git/MERGE_MSG when fast-forwarding
  rebase -i: add another reword test
  rebase -r: make 'merge -c' behave like reword
2021-10-12 13:51:39 -07:00
8b02ffee3f Merge branch 'pw/rebase-skip-final-fix' into maint
Checking out all the paths from HEAD during the last conflicted
step in "git rebase" and continuing would cause the step to be
skipped (which is expected), but leaves MERGE_MSG file behind in
$GIT_DIR and confuses the next "git commit", which has been
corrected.

* pw/rebase-skip-final-fix:
  rebase --continue: remove .git/MERGE_MSG
  rebase --apply: restore some tests
  t3403: fix commit authorship
2021-10-12 13:51:39 -07:00
ff09581e12 Merge branch 'cb/ci-use-upload-artifacts-v1' into maint
Use upload-artifacts v1 (instead of v2) for 32-bit linux, as the
new version has a blocker bug for that architecture.

* cb/ci-use-upload-artifacts-v1:
  ci: use upload-artifacts v1 for dockerized jobs
2021-10-12 13:51:38 -07:00
444b8548b1 Merge branch 'jk/commit-edit-fixup-fix' into maint
"git commit --fixup" now works with "--edit" again, after it was
broken in v2.32.

* jk/commit-edit-fixup-fix:
  commit: restore --edit when combined with --fixup
2021-10-12 13:51:38 -07:00
ee2b241b69 Merge branch 'jk/range-diff-fixes' into maint
"git range-diff" code clean-up.

* jk/range-diff-fixes:
  range-diff: use ssize_t for parsed "len" in read_patches()
  range-diff: handle unterminated lines in read_patches()
  range-diff: drop useless "offset" variable from read_patches()
2021-10-12 13:51:38 -07:00
1725c4c64b Merge branch 'jk/apply-binary-hunk-parsing-fix' into maint
"git apply" miscounted the bytes and failed to read to the end of
binary hunks.

* jk/apply-binary-hunk-parsing-fix:
  apply: keep buffer/size pair in sync when parsing binary hunks
2021-10-12 13:51:37 -07:00
b20f67a659 Merge branch 'en/pull-conflicting-options' into maint
"git pull" had various corner cases that were not well thought out
around its --rebase backend, e.g. "git pull --ff-only" did not stop
but went ahead and rebased when the history on other side is not a
descendant of our history.  The series tries to fix them up.

* en/pull-conflicting-options:
  pull: fix handling of multiple heads
  pull: update docs & code for option compatibility with rebasing
  pull: abort by default when fast-forwarding is not possible
  pull: make --rebase and --no-rebase override pull.ff=only
  pull: since --ff-only overrides, handle it first
  pull: abort if --ff-only is given and fast-forwarding is impossible
  t7601: add tests of interactions with multiple merge heads and config
  t7601: test interaction of merge/rebase/fast-forward flags and options
2021-10-12 13:51:36 -07:00
6d71443d8e Merge branch 'jt/push-negotiation-fixes' into maint
Bugfix for common ancestor negotiation recently introduced in "git
push" codepath.

* jt/push-negotiation-fixes:
  fetch: die on invalid --negotiation-tip hash
  send-pack: fix push nego. when remote has refs
  send-pack: fix push.negotiate with remote helper
2021-10-12 13:51:36 -07:00
0a15e94e10 Merge branch 'ab/pack-stdin-packs-fix' into maint
Input validation of "git pack-objects --stdin-packs" has been
corrected.

* ab/pack-stdin-packs-fix:
  pack-objects: fix segfault in --stdin-packs option
  pack-objects tests: cover blindspots in stdin handling
2021-10-12 13:51:36 -07:00
a02d78c4ae Merge branch 'en/typofixes' into maint
Typofixes.

* en/typofixes:
  merge-ort: fix completely wrong comment
  trace2.h: fix trivial comment typo
2021-10-12 13:51:35 -07:00
0fadf75a3d Merge branch 'cb/unicode-14' into maint
The unicode character width table (used for output alignment) has
been updated.

* cb/unicode-14:
  unicode: update the width tables to Unicode 14
2021-10-12 13:51:35 -07:00
ee8870a800 Merge branch 'po/git-config-doc-mentions-help-c' into maint
Doc update.

* po/git-config-doc-mentions-help-c:
  doc: config, tell readers of `git help --config`
2021-10-12 13:51:35 -07:00
25ed97c61b Merge branch 'kz/revindex-comment-fix' into maint
Header comment fix.

* kz/revindex-comment-fix:
  pack-revindex.h: correct the time complexity descriptions
2021-10-12 13:51:34 -07:00
32e28fc2de Merge branch 'cb/plug-leaks-in-alloca-emu-users' into maint
Leakfix.

* cb/plug-leaks-in-alloca-emu-users:
  t0000: avoid masking git exit value through pipes
  tree-diff: fix leak when not HAVE_ALLOCA_H
2021-10-12 13:51:34 -07:00
70fb2c9886 Merge branch 'ma/doc-git-version' into maint
Doc update.

* ma/doc-git-version:
  documentation: add documentation for 'git version'
2021-10-12 13:51:34 -07:00
b306aefded Merge branch 'rs/drop-core-compression-vars' into maint
Code clean-up.

* rs/drop-core-compression-vars:
  compression: drop write-only core_compression_* variables
2021-10-12 13:51:33 -07:00
0e17a537f3 Merge branch 'jk/t5562-racefix' into maint
Test update.

* jk/t5562-racefix:
  t5562: use alarm() to interrupt timed child-wait
2021-10-12 13:51:33 -07:00
49c2cbe69a Merge branch 'rs/setup-use-xopen-and-xdup' into maint
Code clean-up.

* rs/setup-use-xopen-and-xdup:
  setup: use xopen and xdup in sanitize_stdfds
2021-10-12 13:51:33 -07:00
f72187eaf5 Merge branch 'jc/prefix-filename-allocates' into maint
Leakfix.

* jc/prefix-filename-allocates:
  hash-object: prefix_filename() returns allocated memory these days
2021-10-12 13:51:32 -07:00
32b6c51888 Merge branch 'ab/no-more-check-bindir' into maint
Build simplification.

* ab/no-more-check-bindir:
  Makefile: remove the check_bindir script
2021-10-12 13:51:32 -07:00
b37c9a0fd1 Merge branch 'bs/doc-bugreport-outdir' into maint
Docfix.

* bs/doc-bugreport-outdir:
  Documentation: fix default directory of git bugreport -o
2021-10-12 13:51:32 -07:00
7bdcb8ef1f Merge branch 'cb/ci-build-pedantic' into maint
CI update.

* cb/ci-build-pedantic:
  ci: run a pedantic build as part of the GitHub workflow
2021-10-12 13:51:31 -07:00
6a4fd600f4 Merge branch 'rs/archive-use-object-id' into maint
Code cleanup.

* rs/archive-use-object-id:
  archive: convert queue_directory to struct object_id
2021-10-12 13:51:31 -07:00
690bd356fe Merge branch 'rs/show-branch-simplify' into maint
Code cleanup.

* rs/show-branch-simplify:
  show-branch: simplify rev_is_head()
2021-10-12 13:51:31 -07:00
d46f95a2bd Merge branch 'cb/remote-ndebug-fix' into maint
Build fix.

* cb/remote-ndebug-fix:
  remote: avoid -Wunused-but-set-variable in gcc with -DNDEBUG
2021-10-12 13:51:30 -07:00
dca0768820 Merge branch 'ab/mailmap-leakfix' into maint
Leakfix.

* ab/mailmap-leakfix:
  mailmap.c: fix a memory leak in free_mailap_{info,entry}()
2021-10-12 13:51:30 -07:00
49b7148778 Merge branch 'ab/gc-log-rephrase' into maint
A pathname in an advice message has been made cut-and-paste ready.

* ab/gc-log-rephrase:
  gc: remove trailing dot from "gc.log" line
2021-10-12 13:51:30 -07:00
ffa6045102 Merge branch 'ba/object-info' into maint
Leakfix.

* ba/object-info:
  protocol-caps.c: fix memory leak in send_info()
2021-10-12 13:51:29 -07:00
fe77a458d1 Merge branch 'es/walken-tutorial-fix' into maint
Typofix.

* es/walken-tutorial-fix:
  doc: fix syntax error and the format of printf
2021-10-12 13:51:29 -07:00
1c23cc1344 Merge branch 'rs/xopen-reports-open-failures' into maint
Error diagnostics improvement.

* rs/xopen-reports-open-failures:
  use xopen() to handle fatal open(2) failures
  xopen: explicitly report creation failures
2021-10-12 13:51:28 -07:00
3de9da8e2c Merge branch 'dd/t6300-wo-gpg-fix' into maint
Test fix.

* dd/t6300-wo-gpg-fix:
  t6300: check for cat-file exit status code
  t6300: don't run cat-file on non-existent object
2021-10-12 13:51:28 -07:00
d4d96982ec Merge branch 'mh/credential-leakfix' into maint
Leak fix.

* mh/credential-leakfix:
  credential: fix leak in credential_apply_config()
2021-10-12 13:51:28 -07:00
fdad5ab3eb Merge branch 'jk/t5323-no-pack-test-fix' into maint
Test fix.

* jk/t5323-no-pack-test-fix:
  t5323: drop mentions of "master"
2021-10-12 13:51:27 -07:00
b40b6187e4 Merge branch 'js/maintenance-launchctl-fix' into maint
"git maintenance" scheduler fix for macOS.

* js/maintenance-launchctl-fix:
  maintenance: skip bootout/bootstrap when plist is registered
  maintenance: create `launchctl` configuration using a lock file
2021-10-12 13:51:27 -07:00
9b338cbefd Merge branch 'ab/rebase-fatal-fatal-fix' into maint
Error message fix.

* ab/rebase-fatal-fatal-fix:
  rebase: emit one "fatal" in "fatal: fatal: <error>"
2021-10-12 13:51:27 -07:00
d79e73a833 Merge branch 'ab/ls-remote-packet-trace' into maint
Debugging aid fix.

* ab/ls-remote-packet-trace:
  ls-remote: set packet_trace_identity(<name>)
2021-10-12 13:51:26 -07:00
5e01fc9d80 Merge branch 'ga/send-email-sendmail-cmd' into maint
Test fix.

* ga/send-email-sendmail-cmd:
  t9001: PATH must not use Windows-style paths
2021-10-12 13:51:26 -07:00
5d54f964c6 Merge branch 'me/t5582-cleanup' into maint
Test fix.

* me/t5582-cleanup:
  t5582: remove spurious 'cd "$D"' line
2021-10-12 13:51:25 -07:00
5586bd2de2 Merge branch 'sg/make-fix-ar-invocation' into maint
Build fix.

* sg/make-fix-ar-invocation:
  Makefile: remove archives before manipulating them with 'ar'
2021-10-12 13:51:25 -07:00
4e408f1060 Merge branch 'ti/tcsh-completion-regression-fix' into maint
Update to the command line completion (in contrib/) for tcsh.

* ti/tcsh-completion-regression-fix:
  completion: tcsh: Fix regression by drop of wrapper functions
2021-10-12 13:51:24 -07:00
c5d1c7028d Merge branch 'fc/completion-updates' into maint
Command line completion updates.

* fc/completion-updates:
  completion: bash: add correct suffix in variables
  completion: bash: fix for multiple dash commands
  completion: bash: fix for suboptions with value
  completion: bash: fix prefix detection in branch.*
2021-10-12 13:51:24 -07:00
b4266066b6 Merge branch 'cb/ci-freebsd-update' into maint
Update FreeBSD CI job

* cb/ci-freebsd-update:
  ci: update freebsd 12 cirrus job
2021-10-12 13:51:24 -07:00
6a0699bccc Merge branch 'cb/builtin-merge-format-string-fix' into maint
Code clean-up.

* cb/builtin-merge-format-string-fix:
  builtin/merge: avoid -Wformat-extra-args from ancient Xcode
2021-10-12 13:51:23 -07:00
367e9feee2 Merge branch 'js/log-protocol-version' into maint
Debugging aid.

* js/log-protocol-version:
  connect, protocol: log negotiated protocol version
2021-10-12 13:51:22 -07:00
77357e806f Merge branch 'en/merge-strategy-docs' into maint
Documentation updates.

* en/merge-strategy-docs:
  Update error message and code comment
  merge-strategies.txt: add coverage of the `ort` merge strategy
  git-rebase.txt: correct out-of-date and misleading text about renames
  merge-strategies.txt: fix simple capitalization error
  merge-strategies.txt: avoid giving special preference to patience algorithm
  merge-strategies.txt: do not imply using copy detection is desired
  merge-strategies.txt: update wording for the resolve strategy
  Documentation: edit awkward references to `git merge-recursive`
  directory-rename-detection.txt: small updates due to merge-ort optimizations
  git-rebase.txt: correct antiquated claims about --rebase-merges
2021-10-12 13:51:22 -07:00
dc79a67841 Merge branch 'ab/bundle-doc' into maint
Doc update.

* ab/bundle-doc:
  bundle doc: replace "basis" with "prerequsite(s)"
  bundle doc: elaborate on rev<->ref restriction
  bundle doc: elaborate on object prerequisites
  bundle doc: rewrite the "DESCRIPTION" section
2021-10-12 13:51:22 -07:00
e578d0311d add: don't write objects with --dry-run
When the option --dry-run/-n is given, "git add" doesn't change the
index, but still writes out new object files.  Only hash the latter
without writing instead to make the run as dry as possible.

Use this opportunity to also make the hash_flags variable unsigned,
to match the index_path() parameter it is used as.

Reported-by: git.mexon@spamgourmet.com
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-12 13:15:49 -07:00
4ef91a2d79 commit: fix duplication regression in permission error output
Fix a regression in the error output emitted when .git/objects can't
be written to. Before 9c4d6c0297 (cache-tree: Write updated
cache-tree after commit, 2014-07-13) we'd emit only one "insufficient
permission" error, now we'll do so again.

The cause is rather straightforward, we've got WRITE_TREE_SILENT for
the use-case of wanting to prepare an index silently, quieting any
permission etc. error output. Then when we attempt to update to
that (possibly broken) index we'll run into the same errors again.

But with 9c4d6c0297 the gap between the cache-tree API and the object
store wasn't closed in terms of asking write_object_file() to be
silent. I.e. post-9c4d6c0297b the first call is to prepare_index(),
and after that we'll call prepare_to_commit(). We only want verbose
error output from the latter.

So let's add and use that facility with a corresponding HASH_SILENT
flag, its only user is cache-tree.c's update_one(), which will set it
if its "WRITE_TREE_SILENT" flag is set.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-12 11:16:59 -07:00
119b26d6b9 unwritable tests: assert exact error output
In preparation for fixing a regression where we started emitting some
of these error messages twice, let's assert what the output from "git
commit" and friends is now in the case of permission errors.

As noted in [1] using test_expect_failure to mark up a TODO test has
some unexpected edge cases, e.g. we don't want to break --run=3 by
skipping the "test_lazy_prereq" here. This pattern allows us to test
just the test_cmp (and the "cat", which shouldn't fail) with the added
"test_expect_failure", we'll flip that to a "test_expect_success" in
the next commit.

1. https://lore.kernel.org/git/87tuhmk19c.fsf@evledraar.gmail.com/T/#u

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-12 11:16:42 -07:00
9d12546de9 ssh signing: fmt-merge-msg tests & config parse
When merging a signed tag fmt-merge-msg was unable to verify its
validity missing the necessary ssh allowedSignersFile config.

Adds gpg config parsing to fmt-merge-msg.
Adds tests for ssh signed tags to fmt-merge-msg tests.

Signed-off-by: Fabian Stelzer <fs@gigacodes.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-12 10:35:27 -07:00
e8191a5265 Merge branch 'fs/ssh-signing' into fs/ssh-signing-fix
* fs/ssh-signing:
  ssh signing: test that gpg fails for unknown keys
  ssh signing: tests for logs, tags & push certs
  ssh signing: duplicate t7510 tests for commits
  ssh signing: verify signatures using ssh-keygen
  ssh signing: provide a textual signing_key_id
  ssh signing: retrieve a default key from ssh-agent
  ssh signing: add ssh key format and signing code
  ssh signing: add test prereqs
  ssh signing: preliminary refactoring and clean-up
2021-10-12 10:35:19 -07:00
be79131a53 perf: disable automatic housekeeping
Turn off automatic background maintenance for perf tests by default to
avoid interference with performance measurements.  Do that by using the
new file t/perf/config and using it as the system config file for perf
tests.  Future tests intended to measure gc performance can override
the setting locally or call "git gc" explicitly.

This fixes a breakage in p2000 caused by gc automatically emptying the
reflog due its fake dates from 2005 being older than 90 days.

Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-11 13:17:58 -07:00
2a97289ad8 Twelfth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-11 10:22:14 -07:00
68ef6c0b1a Merge branch 'tb/aggregate-ignore-leading-whitespaces'
Test portability update.

* tb/aggregate-ignore-leading-whitespaces:
  t/perf/aggregate.perl: tolerate leading spaces
2021-10-11 10:21:49 -07:00
252caf8e41 Merge branch 'rs/p3400-lose-tac'
Test portability update.

* rs/p3400-lose-tac:
  p3400: stop using tac(1)
2021-10-11 10:21:49 -07:00
1e50f2a689 Merge branch 'mr/bisect-in-c-4'
Message fix.

* mr/bisect-in-c-4:
  bisect--helper: add space between colon and following sentence
2021-10-11 10:21:49 -07:00
0cc4ec1550 Merge branch 'da/difftool'
Code clean-up in "git difftool".

* da/difftool:
  difftool: add a missing space to the run_dir_diff() comments
  difftool: remove an unnecessary call to strbuf_release()
  difftool: refactor dir-diff to write files using helper functions
  difftool: create a tmpdir path without repeated slashes
2021-10-11 10:21:48 -07:00
404c4a5462 Merge branch 'ab/designated-initializers'
Code clean-up.

* ab/designated-initializers:
  cbtree.h: define cb_init() in terms of CBTREE_INIT
  *.h: move some *_INIT to designated initializers
  *.h _INIT macros: don't specify fields equal to 0
  *.[ch] *_INIT macros: use { 0 } for a "zero out" idiom
  submodule-config.h: remove unused SUBMODULE_INIT macro
2021-10-11 10:21:48 -07:00
859a585bdf Merge branch 'ab/sanitize-leak-ci'
CI learns to run the leak sanitizer builds.

* ab/sanitize-leak-ci:
  tests: add a test mode for SANITIZE=leak, run it in CI
  Makefile: add SANITIZE=leak flag to GIT-BUILD-OPTIONS
2021-10-11 10:21:47 -07:00
62bff959c7 Merge branch 'ab/retire-git-config-key-is-valid'
Code cleanup.

* ab/retire-git-config-key-is-valid:
  config.c: remove unused git_config_key_is_valid()
2021-10-11 10:21:47 -07:00
f6c075ad71 Merge branch 'jk/ref-paranoia'
The ref iteration code used to optionally allow dangling refs to be
shown, which has been tightened up.

* jk/ref-paranoia:
  refs: drop "broken" flag from for_each_fullref_in()
  ref-filter: drop broken-ref code entirely
  ref-filter: stop setting FILTER_REFS_INCLUDE_BROKEN
  repack, prune: drop GIT_REF_PARANOIA settings
  refs: turn on GIT_REF_PARANOIA by default
  refs: omit dangling symrefs when using GIT_REF_PARANOIA
  refs: add DO_FOR_EACH_OMIT_DANGLING_SYMREFS flag
  refs-internal.h: reorganize DO_FOR_EACH_* flag documentation
  refs-internal.h: move DO_FOR_EACH_* flags next to each other
  t5312: be more assertive about command failure
  t5312: test non-destructive repack
  t5312: create bogus ref as necessary
  t5312: drop "verbose" helper
  t5600: provide detached HEAD for corruption failures
  t5516: don't use HEAD ref for invalid ref-deletion tests
  t7900: clean up some more broken refs
2021-10-11 10:21:47 -07:00
97492aacff Merge branch 'ab/http-pinned-public-key-mismatch'
HTTPS error handling updates.

* ab/http-pinned-public-key-mismatch:
  http: check CURLE_SSL_PINNEDPUBKEYNOTMATCH when emitting errors
2021-10-11 10:21:47 -07:00
4ae0bc75f9 Merge branch 'js/win-lazyload-buildfix'
Compilation fix.

* js/win-lazyload-buildfix:
  Makefile: restrict -Wpedantic and -Wno-pedantic-ms-format better
  lazyload.h: use an even more generic function pointer than FARPROC
  lazyload.h: fix warnings about mismatching function pointer types
2021-10-11 10:21:47 -07:00
ed4d535342 Merge branch 'sg/test-split-index-fix'
Test updates.

* sg/test-split-index-fix:
  read-cache: fix GIT_TEST_SPLIT_INDEX
  tests: disable GIT_TEST_SPLIT_INDEX for sparse index tests
  read-cache: look for shared index files next to the index, too
  t1600-index: disable GIT_TEST_SPLIT_INDEX
  t1600-index: don't run git commands upstream of a pipe
  t1600-index: remove unnecessary redirection
2021-10-11 10:21:47 -07:00
9567a670d2 Merge branch 'tb/midx-write-propagate-namehash'
"git multi-pack-index write --bitmap" learns to propagate the
hashcache from original bitmap to resulting bitmap.

* tb/midx-write-propagate-namehash:
  t5326: test propagating hashcache values
  p5326: generate pack bitmaps before writing the MIDX bitmap
  p5326: don't set core.multiPackIndex unnecessarily
  p5326: create missing 'perf-tag' tag
  midx.c: respect 'pack.writeBitmapHashcache' when writing bitmaps
  pack-bitmap.c: propagate namehash values from existing bitmaps
  t/helper/test-bitmap.c: add 'dump-hashes' mode
2021-10-11 10:21:46 -07:00
c4fdba3383 userdiff-cpp: learn the C++ spaceship operator
Since C++20, the language has a generalized comparison operator <=>.
Teach the cpp driver not to separate it into <= and > tokens.

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-10 15:24:21 -07:00
637b80cd6a userdiff-cpp: permit the digit-separating single-quote in numbers
Since C++17, the single-quote can be used as digit separator:

   3.141'592'654
   1'000'000
   0xdead'beaf

Make it known to the word regex of the cpp driver, so that numbers are
not split into separate tokens at the single-quotes.

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-10 15:24:21 -07:00
bfaaf191a5 userdiff-cpp: prepare test cases with yet unsupported features
We are going to add support for C++'s digit-separating single-quote and
the spaceship operator. By adding the test cases in this separate
commit, the effect on the word highlighting will become more obvious
as the features are implemented and the file cpp/expect is updated.

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-10 15:24:21 -07:00
bf972896d7 cat-file: use packed_object_info() for --batch-all-objects
When "cat-file --batch-all-objects" iterates over each object, it knows
where to find each one. But when we look up details of the object, we
don't use that information at all.

This patch teaches it to use the pack/offset pair when we're iterating
over objects in a pack. This yields a measurable speed improvement
(timings on a fully packed clone of linux.git):

  Benchmark #1: ./git.old cat-file --batch-all-objects --unordered --batch-check="%(objecttype) %(objectname)"
    Time (mean ± σ):      8.128 s ±  0.118 s    [User: 7.968 s, System: 0.156 s]
    Range (min … max):    8.007 s …  8.301 s    10 runs

  Benchmark #2: ./git.new cat-file --batch-all-objects --unordered --batch-check="%(objecttype) %(objectname)"
    Time (mean ± σ):      4.294 s ±  0.064 s    [User: 4.167 s, System: 0.125 s]
    Range (min … max):    4.227 s …  4.457 s    10 runs

  Summary
    './git.new cat-file --batch-all-objects --unordered --batch-check="%(objecttype) %(objectname)"' ran
      1.89 ± 0.04 times faster than './git.old cat-file --batch-all-objects --unordered --batch-check="%(objecttype) %(objectname)"

The implementation is pretty simple: we just call packed_object_info()
instead of oid_object_info_extended() when we can. Most of the changes
are just plumbing the pack/offset pair through the callstack. There is
one subtlety: replace lookups are not handled by packed_object_info().
But since those are disabled for --batch-all-objects, and since we'll
only have pack info when that option is in effect, we don't have to
worry about that.

There are a few limitations to this optimization which we could address
with further work:

 - I didn't bother recording when we found an object loose. Technically
   this could save us doing a fruitless lookup in the pack index. But
   opening and mmap-ing a loose object is so expensive in the first
   place that this doesn't matter much. And if your repository is large
   enough to care about per-object performance, most objects are going
   to be packed anyway.

 - This works only in --unordered mode. For the sorted mode, we'd have
   to record the pack/offset pair as part of our oid-collection. That's
   more code, plus at least 16 extra bytes of heap per object. It would
   probably still be a net win in runtime, but we'd need to measure.

 - For --batch, this still helps us with getting the object metadata,
   but we still do a from-scratch lookup for the object contents. This
   probably doesn't matter that much, because the lookup cost will be
   much smaller relative to the cost of actually unpacking and printing
   the objects.

   For small objects, we could probably swap out read_object_file() for
   using packed_object_info() with a "object_info.contentp" to get the
   contents. But we'd still need to deal with streaming for larger
   objects. A better path forward here is to teach the initial
   oid_object_info_extended() / packed_object_info() calls to retrieve
   the contents of smaller objects while they are already being
   accessed. That would save the extra lookup entirely. But it's a
   non-trivial feature to add to the object_info code, so I left it for
   now.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-08 15:45:14 -07:00
818e393084 cat-file: split ordered/unordered batch-all-objects callbacks
When we originally added --batch-all-objects, it stuffed everything into
an oid_array(), and then iterated over that array with a callback to
write the actual output.

When we later added --unordered, that code path writes immediately as we
discover each object, but just calls the same batch_object_cb() as our
entry point to the writing code. That callback has a narrow interface;
it only receives the oid, but we know much more about each object in the
unordered write (which we'll make use of in the next patch). So let's
just call batch_object_write() directly. The callback wasn't saving us
much effort.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-08 15:45:14 -07:00
5c5b29b459 cat-file: disable refs/replace with --batch-all-objects
When we're enumerating all objects in the object database, it doesn't
make sense to respect refs/replace. The point of this option is to
enumerate all of the objects in the database at a low level. By
definition we'd already show the replacement object's contents (under
its real oid), and showing those contents under another oid is almost
certainly working against what the user is trying to do.

Note that you could make the same argument for something like:

  git show-index <foo.idx |
  awk '{print $2}' |
  git cat-file --batch

but there we can't know in cat-file exactly what the user intended,
because we don't know the source of the input. They could be trying to
do low-level debugging, or they could be doing something more high-level
(e.g., imagine a porcelain built around cat-file for its object
accesses). So in those cases, we'll have to rely on the user specifying
"git --no-replace-objects" to tell us what to do.

One _could_ make an argument that "cat-file --batch" is sufficiently
low-level plumbing that it should not respect replace-objects at all
(and the caller should do any replacement if they want it).  But we have
been doing so for some time. The history is a little tangled:

  - looking back as far as v1.6.6, we would not respect replace refs for
    --batch-check, but would for --batch (because the former used
    sha1_object_info(), and the replace mechanism only affected actual
    object reads)

  - this discrepancy was made even weirder by 98e2092b50 (cat-file:
    teach --batch to stream blob objects, 2013-07-10), where we always
    output the header using the --batch-check code, and then printed the
    object separately. This could lead to "cat-file --batch" dying (when
    it notices the size or type changed for a non-blob object) or even
    producing bogus output (in streaming mode, we didn't notice that we
    wrote the wrong number of bytes).

  - that persisted until 1f7117ef7a (sha1_file: perform object
    replacement in sha1_object_info_extended(), 2013-12-11), which then
    respected replace refs for both forms.

So it has worked reliably this way for over 7 years, and we should make
sure it continues to do so. That could also be an argument that
--batch-all-objects should not change behavior (which this patch is
doing), but I really consider the current behavior to be an unintended
bug. It's a side effect of how the code is implemented (feeding the oids
back into oid_object_info() rather than looking at what we found while
reading the loose and packed object storage).

The implementation is straight-forward: we just disable the global
read_replace_refs flag when we're in --batch-all-objects mode. It would
perhaps be a little cleaner to change the flag we pass to
oid_object_info_extended(), but that's not enough. We also read objects
via read_object_file() and stream_blob_to_fd(). The former could switch
to its _extended() form, but the streaming code has no mechanism for
disabling replace refs. Setting the global flag works, and as a bonus,
it's impossible to have any "oops, we're sometimes replacing the object
and sometimes not" bugs in the output (like the ones caused by
98e2092b50 above).

The tests here cover the regular-input and --batch-all-objects cases,
for both --batch-check and --batch. There is a test in t6050 that covers
the regular-input case with --batch already, but this new one goes much
further in actually verifying the output (plus covering --batch-check
explicitly). This is perhaps a little overkill and the tests would be
simpler just covering --batch-check, but I wanted to make sure we're
checking that --batch output is consistent between the header and the
content. The global-flag technique used here makes that easy to get
right, but this is future-proofing us against regressions.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-08 15:45:14 -07:00
c3660cfb03 cat-file: mention --unordered along with --batch-all-objects
The note on ordering for --batch-all-objects was written when that was
the only possible ordering. These days we have --unordered, too, so
let's point to it.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-08 15:45:14 -07:00
e879295b20 t1006: clean up broken objects
A few of the tests create intentionally broken objects with broken
types. Let's clean them up after we're done with them, so that later
tests don't get confused (we hadn't noticed because this only affects
tests which use --batch-all-objects, but I'm about to add more).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-08 15:45:14 -07:00
71ef66d740 submodule: trace adding submodule ODB as alternate
Submodule ODBs are never added as alternates during the execution of the
test suite, but there may be a rare interaction that the test suite does
not have coverage of. Add a trace message when this happens, so that
users who trace their commands can notice such occurrences.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-08 15:06:06 -07:00
13a2f620b2 submodule: pass repo to check_has_commit()
Pass the repo explicitly when calling check_has_commit() to avoid
relying on add_submodule_odb(). With this commit and the parent commit,
the last remaining tests no longer rely on add_submodule_odb(), so mark
these tests accordingly.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-08 15:06:06 -07:00
eef71904ff object-file: only register submodule ODB if needed
In a35e03dee0 ("submodule: lazily add submodule ODBs as alternates",
2021-09-08), Git was taught to add all known submodule ODBs as
alternates when attempting to read an object that doesn't exist, as a
fallback for when a submodule object is read as if it were in
the_repository. However, this behavior wasn't restricted to happen only
when reading from the_repository. Fix this.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-08 15:06:06 -07:00
155b517d5c merge-{ort,recursive}: remove add_submodule_odb()
After the parent commit and some of its ancestors, the only place
commits are being accessed through alternates is in the user-facing
message formatting code. Fix those, and remove the add_submodule_odb()
calls.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-08 15:06:06 -07:00
8788195c88 refs: peeling non-the_repository iterators is BUG
There is currently no support for peeling the current ref of an iterator
iterating over a non-the_repository ref store, and none is needed. Thus,
for now, BUG() if that happens.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-08 15:06:06 -07:00
9bc45a2802 refs: teach arbitrary repo support to iterators
Note that should_pack_ref() is called when writing refs, which is only
supported for the_repository, hence the_repository is hardcoded there.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-08 15:06:05 -07:00
34224e14d6 refs: plumb repo into ref stores
In preparation for the next 2 patches that adds (partial) support for
arbitrary repositories to ref iterators, plumb a repository into all ref
stores. There are no changes to program logic.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-08 15:06:05 -07:00
6a5c337922 pretty: colorize pattern matches in commit messages
The "git log" command limits its output to the commits that contain strings
matched by a pattern when the "--grep=<pattern>" option is used, but unlike
output from "git grep -e <pattern>", the matches are not highlighted,
making them harder to spot.

Teach the pretty-printer code to highlight matches from the
"--grep=<pattern>", "--author=<pattern>" and "--committer=<pattern>"
options (to view the last one, you may have to ask for --pretty=fuller).

Also, it must be noted that we are effectively greping the content twice
(because it would be a hassle to rework the existing matching code to do
a /g match and then pass it all down to the coloring code), however it only
slows down "git log --author=^H" on this repository by around 1-2%
(compared to v2.33.0), so it should be a small enough slow down to justify
the addition of the feature.

Signed-off-by: Hamza Mahfooz <someguy@effective-light.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-08 14:19:14 -07:00
d342834529 parse-options: change OPT_{SHORT,UNSET} to an enum
Change the comparisons against OPT_SHORT and OPT_UNSET to an enum
which keeps track of how a given option got parsed. The case of "0"
was an implicit OPT_LONG, so let's add an explicit label for it.

Due to the xor in 0f1930c587 (parse-options: allow positivation of
options starting, with no-, 2012-02-25) the code already relied on
this being set back to 0. To avoid refactoring the logic involved in
that let's just start the enum at "0" instead of the usual "1<<0" (1),
but BUG() out if we don't have one of our expected flags.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-08 14:13:11 -07:00
62f2ffc563 parse-options tests: test optname() output
There were no tests for checking the specific output that we'll
generate in optname(), let's add some. That output was added back in
4a59fd1312 (Add a simple option parser., 2007-10-15).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-08 14:13:11 -07:00
28794ec72e parse-options.[ch]: make opt{bug,name}() "static"
Change these two functions to "static", the last user of "optname()"
outside of parse-options.c itself went away in the preceding commit,
for the reasons noted in 9440b831ad (parse-options: replace
opterror() with optname(), 2018-11-10) we shouldn't be adding any more
users of it.

The "optbug()" function was never used outside of parse-options.c, but
was made non-static in 1f275b7c4c (parse-options: export opterr,
optbug, 2011-08-11). I think the only external user of optname() was
the commit-graph.c caller added in 09e0327f57 (builtin/commit-graph.c:
introduce '--max-new-filters=<n>', 2020-09-18).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-08 14:13:11 -07:00
13d9fcec29 commit-graph: stop using optname()
Stop using optname() in builtin/commit-graph.c to emit an error with
the --max-new-filters option. This changes code added in 809e0327f5
(builtin/commit-graph.c: introduce '--max-new-filters=<n>',
2020-09-18).

See 9440b831ad (parse-options: replace opterror() with optname(),
2018-11-10) for why using optname() like this is considered bad,
i.e. it's assembling human-readable output piecemeal, and the "option
`X'" at the start can't be translated.

It didn't matter in this case, but this code was also buggy in its use
of "opt->flags" to optname(), that function expects flags, but not
*those* flags.

Let's pass "max-new-filters" to the new error because the option name
isn't translatable, and because we can re-use a translation added in
f7e68a0878 (parse-options: check empty value in OPT_INTEGER and
OPT_ABBREV, 2019-05-29).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-08 14:13:11 -07:00
3c2047a711 parse-options.c: move optname() earlier in the file
In preparation for making "optname" a static function move it above
its first user in parse-options.c.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-08 14:13:11 -07:00
7bf7f0ba05 parse-options.h: make the "flags" in "struct option" an enum
Change the "flags" members of "struct option" to refer to their
corresponding "enum" defined earlier in the file.

The benefit of changing this to an enum isn't as great as with some
"enum parse_opt_type" as we'll always check this as a bitfield, so we
can't rely on the compiler checking "case" arms for us. But let's do
it for consistency with the rest of the file.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-08 14:13:11 -07:00
1b887353d7 parse-options.c: use exhaustive "case" arms for "enum parse_opt_result"
Change the "default" case in parse_options() that handles the return
value of parse_options_step() to simply have a "case" arm for
PARSE_OPT_UNKNOWN, instead of leaving it to a comment. This means the
compiler can warn us about any missing case arms.

This adjusts code added in ff43ec3e2d (parse-opt: create
parse_options_step., 2008-06-23), given its age it may pre-date the
existence (or widespread use) of this coding style, which we've since
adopted more widely.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-08 14:13:11 -07:00
352e761388 parse-options.[ch]: consistently use "enum parse_opt_result"
Use the "enum parse_opt_result" instead of an "int flags" as the
return value of the applicable functions in parse-options.c.

This will help catch future bugs, such as the missing "case" arms in
the two existing users of the API in "blame.c" and "shortlog.c". A
third caller in 309be813c9 (update-index: migrate to parse-options
API, 2010-12-01) was already checking for these.

As can be seen when trying to sort through the deluge of warnings
produced when compiling this with CC=g++ (mostly unrelated to this
change) we're not consistently using "enum parse_opt_result" even now,
i.e. we'll return error() and "return 0;". See f41179f16b
(parse-options: avoid magic return codes, 2019-01-27) for a commit
which started changing some of that.

I'm not doing any more of that exhaustive migration here, and it's
probably not worthwhile past the point of being able to check "enum
parse_opt_result" in switch().

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-08 14:13:11 -07:00
3f9ab7ccde parse-options.[ch]: consistently use "enum parse_opt_flags"
Use the "enum parse_opt_flags" instead of an "int flags" as arguments
to the various functions in parse-options.c.

Even though this is an enum bitfield there's there's a benefit to
doing this when it comes to the wider C ecosystem. E.g. the GNU
debugger (gdb) will helpfully detect and print out meaningful enum
labels in this case. Here's the output before and after when breaking
in "parse_options()" after invoking "git stash show":

Before:

    (gdb) p flags
    $1 = 9

After:

    (gdb) p flags
    $1 = (PARSE_OPT_KEEP_DASHDASH | PARSE_OPT_KEEP_UNKNOWN)

Of course as noted in[1] there's a limit to this smartness,
i.e. manually setting it with unrelated enum labels won't be
caught. There are some third-party extensions to do more exhaustive
checking[2], perhaps we'll be able to make use of them sooner than
later.

We've also got prior art using this pattern in the codebase. See
e.g. "enum bloom_filter_computed" added in 312cff5207 (bloom: split
'get_bloom_filter()' in two, 2020-09-16) and the "permitted" enum
added in ce910287e7 (add -p: fix checking of user input, 2020-08-17).

1. https://lore.kernel.org/git/87mtnvvj3c.fsf@evledraar.gmail.com/
2. https://github.com/sinelaw/elfs-clang-plugins/blob/master/enums_conversion/README.md

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-08 14:13:11 -07:00
8c32856133 blame: document --color-* options
Commit cdc2d5f11f (builtin/blame: dim uninteresting metadata lines,
2018-04-23) and 25d5f52901 (builtin/blame: highlight recently changed
lines, 2018-04-23) introduce --color-lines and --color-by-age options to
git blame, respectively. While both options are mentioned in usage help,
they aren't documented in git-blame(1). Document them.

Co-authored-by: Dr. Matthias St. Pierre <m.st.pierre@ncp-e.com>
Signed-off-by: Dr. Matthias St. Pierre <m.st.pierre@ncp-e.com>
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-08 14:05:43 -07:00
350b87cd65 userdiff-cpp: tighten word regex
Generally, word regex can be written such that they match tokens
liberally and need not model the actual syntax because it can be assumed
that the regex will only be applied to syntactically correct text.

The regex for cpp (C/C++) is too liberal, though. It regards these
sequences as single tokens:

   1+2
   1.5-e+2+f

and the following amalgams as one token:

   .l      as in str.length
   .f      as in str.find
   .e      as in str.erase

Tighten the regex in the following way:

- Accept + and - only in one position in the exponent. + and - are no
  longer regarded as the sign of a number and are treated by the
  catcher-all that is not visible in the driver's regex.

- Accept a leading decimal point only when it is followed by a digit.

For readability, factor hex- and binary numbers into an own term.

As a drive-by, this fixes that floating point numbers such as 12E5
(with upper-case E) were split into two tokens.

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-08 13:04:07 -07:00
3e063de46e t4034: add tests showing problematic cpp tokenizations
The word regex is too loose and matches long streaks of characters
that should actually be separate tokens.  Add these problematic test
cases. Separate the lines with text that will remain identical in the
pre- and post-image so that the diff algorithm will not lump removals
and additions of consecutive lines together. This makes the expected
output easier to read.

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-08 13:04:07 -07:00
1cf93847c1 t4034/cpp: actually test that operator tokens are not split
8d96e7288f (t4034: bulk verify builtin word regex sanity, 2010-12-18)
added many tests with the intent to verify that operators consisting of
more than one symbol are kept together. These are tested by probing a
transition from, e.g., a!=b to x!=y, which results in the word-diff

  [-a-]{+x+}!=[-b-]{+y+}

But that proves only that the letters and operators are separate tokens.
To prove that != is an unseparable token, we have to probe a transition
from, e.g., a=b to a!=b having a word-diff

  a[-=-]{+!=+}b

that proves that the ! is not separate from the =.

In the post-image, add to or remove from operators a character that
turns it into another valid operator.

Change the identifiers used around operators such that the diff
algorithm does not have an incentive to match, e.g., a<b in one spot
in the pre-image with a<b elsewhere in the post-image.

Adjust the expected output to match the new differences. Notice that
there are some undesirable tokenizations around e, ., and -.  This will
be addressed in a later change.

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-08 13:04:07 -07:00
c90cfc225b test-mergesort: use repeatable random numbers
Use MINSTD to generate pseudo-random numbers consistently instead of
using rand(3), whose output can vary from system to system, and reset
its seed before filling in the test values.  This gives repeatable
results across versions and systems, which simplifies sharing and
comparing of results between developers.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-08 10:04:56 -07:00
c8ad9d04c6 read-cache: let verify_path() reject trailing dir separators again
6e773527b6 (sparse-index: convert from full to sparse, 2021-03-30) made
verify_path() accept trailing directory separators for directories,
which is necessary for sparse directory entries.  This clemency causes
"git stash" to stumble over sub-repositories, though, and there may be
more unintended side-effects.

Avoid them by restoring the old verify_path() behavior and accepting
trailing directory separators only in places that are supposed to handle
sparse directory entries.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-07 17:52:26 -07:00
2a1ae649a4 read-cache: add verify_path_internal()
Turn verify_path() into an internal function that distinguishes between
valid paths and those with trailing directory separators and rename it
to verify_path_internal().  Provide a wrapper with the old behavior
under the old name.  No functional change intended.  The new function
will be used in the next patch.

Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-07 17:49:39 -07:00
fc5e90b848 t3905: show failure to ignore sub-repo
"git stash" used to ignore sub-repositories until 6e773527b6
(sparse-index: convert from full to sparse, 2021-03-30).  Add a test
that demonstrates this regression.

Reported-by: Robert Leftwich <robert@gitpod.io>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-07 17:47:39 -07:00
465028e0e2 merge: add missing strbuf_release()
We strbuf_reset() this "struct strbuf" in a loop earlier, but never
freed it. Plugs a memory leak that's been here ever since this code
got introduced in 1c7b76be7d (Build in merge, 2008-07-07).

This takes us from 68 failed tests in "t7600-merge.sh" to 59 under
SANITIZE=leak, and makes "t7604-merge-custom-message.sh" pass!

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-07 15:40:16 -07:00
272f0a574d ls-files: add missing string_list_clear()
Fix a memory leak that's been here ever since 72aeb18772 (clean.c,
ls-files.c: respect encapsulation of exclude_list_groups, 2013-01-16),
we dup'd the argument in option_parse_exclude(), but never freed the
string_list.

This makes almost all of t3001-ls-files-others-exclude.sh pass (it had
a lot of failures before). Let's mark it as passing with
TEST_PASSES_SANITIZE_LEAK=true, and then exclude the tests that still
failed with a !SANITIZE_LEAK prerequisite check until we fix those
leaks. We can still see the failed tests under
GIT_TEST_FAIL_PREREQS=true.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-07 15:40:15 -07:00
eab4ac6a23 ls-files: fix a trivial dir_clear() leak
Fix an edge case that was missed when the dir_clear() call was added
in eceba53214 (dir: fix problematic API to avoid memory leaks,
2020-08-18), we need to also clean up when we're about to exit with
non-zero.

That commit says, on the topic of the dir_clear() API and UNLEAK():

    [...]two of them clearly thought about leaks since they had an
    UNLEAK(dir) directive, which to me suggests that the method to
    free the data was too unclear.

I think that 0e5bba53af (add UNLEAK annotation for reducing leak
false positives, 2017-09-08) which added the UNLEAK() makes it clear
that that wasn't the case, rather it was the desire to avoid the
complexity of freeing the memory at the end of the program.

This does add a bit of complexity, but I think it's worth it to just
fix these leaks when it's easy in built-ins. It allows them to serve
as canaries for underlying APIs that shouldn't be leaking, it
encourages us to make those freeing APIs nicer for all their users,
and it prevents other leaking regressions by being able to mark the
entire test as TEST_PASSES_SANITIZE_LEAK=true.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-07 15:40:15 -07:00
6ad66ab45e tests: fix test-oid-array leak, test in SANITIZE=leak
Fix a trivial memory leak present ever since 38d905bf58 (sha1-array:
add test-sha1-array and basic tests, 2014-10-01), now that that's
fixed we can test this under GIT_TEST_PASSING_SANITIZE_LEAK=true.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-07 15:40:15 -07:00
926d233035 tests: fix a memory leak in test-oidtree.c
Fix a memory leak in t/helper/test-oidtree.c, we were not freeing the
"struct strbuf" we used for the stdin input we parsed. This leak has
been here ever since 92d8ed8ac1 (oidtree: a crit-bit tree for
odb_loose_cache, 2021-07-07).

Now that it's fixed we can declare that t0069-oidtree.sh will pass
under GIT_TEST_PASSING_SANITIZE_LEAK=true.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-07 15:40:15 -07:00
c0b80e05f7 tests: fix a memory leak in test-parse-options.c
Fix a memory leak in t/helper/test-parse-options.c, we were not
freeing the allocated "struct string_list" or its items. Let's move
the declaration of the "list" variable into the cmd__parse_options()
and release it at the end.

In c8ba163916 (parse-options: add OPT_STRING_LIST helper, 2011-06-09)
the "list" variable was added, and later on in
c8ba163916 (parse-options: add OPT_STRING_LIST helper, 2011-06-09)
the "expect" was added.

The "list" variable was last touched in 2721ce21e4 (use string_list
initializer consistently, 2016-06-13), but it was still left at the
static scope, it's better to move it to the function for consistency.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-07 15:40:15 -07:00
6a75658c0a tests: fix a memory leak in test-prio-queue.c
Fix a memory leak in t/helper/test-prio-queue.c, the lack of freeing
the memory with clear_prio_queue() has been there ever since this code
was originally added in b4b594a315 (prio-queue: priority queue of
pointers to structs, 2013-06-06).

By fixing this leak we can cleanly run t0009-prio-queue.sh under
SANITIZE=leak, so annotate it as such with
TEST_PASSES_SANITIZE_LEAK=true.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-07 15:40:15 -07:00
6cb3deb451 Merge branch 'ab/sanitize-leak-ci' into ab/mark-leak-free-tests-more
* ab/sanitize-leak-ci:
  tests: add a test mode for SANITIZE=leak, run it in CI
  Makefile: add SANITIZE=leak flag to GIT-BUILD-OPTIONS
2021-10-07 15:39:59 -07:00
25dc57bac8 Merge branch 'ab/sanitize-leak-ci' into ab/mark-leak-free-tests
* ab/sanitize-leak-ci:
  tests: add a test mode for SANITIZE=leak, run it in CI
  Makefile: add SANITIZE=leak flag to GIT-BUILD-OPTIONS
2021-10-07 15:36:00 -07:00
e5a917fcf4 unpack-trees: don't leak memory in verify_clean_subdirectory()
Fix two different but related memory leaks in
verify_clean_subdirectory(). We leaked both the "pathbuf" if
read_directory() returned non-zero, and we never cleaned up our own
"struct dir_struct" either.

 * "pathbuf": When the read_directory() call followed by the
   free(pathbuf) was added in c81935348b (Fix switching to a branch
   with D/F when current branch has file D., 2007-03-15) we didn't
   bother to free() before we called die().

   But when this code was later libified in 203a2fe117 (Allow callers
   of unpack_trees() to handle failure, 2008-02-07) we started to leak
   as we returned data to the caller. This fixes that memory leak,
   which can be observed under SANITIZE=leak with e.g. the
   "t1001-read-tree-m-2way.sh" test.

 * "struct dir_struct": We've leaked the dir_struct ever since this
   code was added back in c81935348b.

   When that commit was written there wasn't an equivalent of
   dir_clear(). Since it was added in 270be81604 (dir.c: provide
   clear_directory() for reclaiming dir_struct memory, 2013-01-06)
   we've omitted freeing the memory allocated here.

   This memory leak could also be observed under SANITIZE=leak and the
   "t1001-read-tree-m-2way.sh" test.

This makes all the test in "t1001-read-tree-m-2way.sh" pass under
"GIT_TEST_PASSING_SANITIZE_LEAK=true", we'd previously die in tests
25, 26 & 28.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-07 15:29:02 -07:00
9d05b459c7 Merge branch 'ab/sanitize-leak-ci' into ab/unpack-trees-leakfix
* ab/sanitize-leak-ci:
  tests: add a test mode for SANITIZE=leak, run it in CI
  Makefile: add SANITIZE=leak flag to GIT-BUILD-OPTIONS
2021-10-07 15:28:38 -07:00
f751097be3 sparse index: fix use-after-free bug in cache_tree_verify()
In a sparse index it is possible for the tree that is being verified
to be freed while it is being verified. This happens when the index is
sparse but the cache tree is not and index_name_pos() looks up a path
from the cache tree that is a descendant of a sparse index entry. That
triggers a call to ensure_full_index() which frees the cache tree that
is being verified.  Carrying on trying to verify the tree after this
results in a use-after-free bug. Instead restart the verification if a
sparse index is converted to a full index. This bug is triggered by a
call to reset_head() in "git rebase --apply". Thanks to René Scharfe
and Derrick Stolee for their help analyzing the problem.

==74345==ERROR: AddressSanitizer: heap-use-after-free on address 0x606000001b20 at pc 0x557cbe82d3a2 bp 0x7ffdfee08090 sp 0x7ffdfee08080
READ of size 4 at 0x606000001b20 thread T0
    #0 0x557cbe82d3a1 in verify_one /home/phil/src/git/cache-tree.c:863
    #1 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840
    #2 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840
    #3 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840
    #4 0x557cbe830a2b in cache_tree_verify /home/phil/src/git/cache-tree.c:910
    #5 0x557cbea53741 in write_locked_index /home/phil/src/git/read-cache.c:3250
    #6 0x557cbeab7fdd in reset_head /home/phil/src/git/reset.c:87
    #7 0x557cbe72147f in cmd_rebase builtin/rebase.c:2074
    #8 0x557cbe5bd151 in run_builtin /home/phil/src/git/git.c:461
    #9 0x557cbe5bd151 in handle_builtin /home/phil/src/git/git.c:714
    #10 0x557cbe5c0503 in run_argv /home/phil/src/git/git.c:781
    #11 0x557cbe5c0503 in cmd_main /home/phil/src/git/git.c:912
    #12 0x557cbe5bad28 in main /home/phil/src/git/common-main.c:52
    #13 0x7fdd4b82eb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)
    #14 0x557cbe5bcb8d in _start (/home/phil/src/git/git+0x1b9b8d)

0x606000001b20 is located 0 bytes inside of 56-byte region [0x606000001b20,0x606000001b58)
freed by thread T0 here:
    #0 0x7fdd4bacff19 in __interceptor_free /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:127
    #1 0x557cbe82af60 in cache_tree_free /home/phil/src/git/cache-tree.c:35
    #2 0x557cbe82aee5 in cache_tree_free /home/phil/src/git/cache-tree.c:31
    #3 0x557cbe82aee5 in cache_tree_free /home/phil/src/git/cache-tree.c:31
    #4 0x557cbe82aee5 in cache_tree_free /home/phil/src/git/cache-tree.c:31
    #5 0x557cbeb2557a in ensure_full_index /home/phil/src/git/sparse-index.c:310
    #6 0x557cbea45c4a in index_name_stage_pos /home/phil/src/git/read-cache.c:588
    #7 0x557cbe82ce37 in verify_one /home/phil/src/git/cache-tree.c:850
    #8 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840
    #9 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840
    #10 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840
    #11 0x557cbe830a2b in cache_tree_verify /home/phil/src/git/cache-tree.c:910
    #12 0x557cbea53741 in write_locked_index /home/phil/src/git/read-cache.c:3250
    #13 0x557cbeab7fdd in reset_head /home/phil/src/git/reset.c:87
    #14 0x557cbe72147f in cmd_rebase builtin/rebase.c:2074
    #15 0x557cbe5bd151 in run_builtin /home/phil/src/git/git.c:461
    #16 0x557cbe5bd151 in handle_builtin /home/phil/src/git/git.c:714
    #17 0x557cbe5c0503 in run_argv /home/phil/src/git/git.c:781
    #18 0x557cbe5c0503 in cmd_main /home/phil/src/git/git.c:912
    #19 0x557cbe5bad28 in main /home/phil/src/git/common-main.c:52
    #20 0x7fdd4b82eb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)

previously allocated by thread T0 here:
    #0 0x7fdd4bad0459 in __interceptor_calloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x557cbebc1807 in xcalloc /home/phil/src/git/wrapper.c:140
    #2 0x557cbe82b7d8 in cache_tree /home/phil/src/git/cache-tree.c:17
    #3 0x557cbe82b7d8 in prime_cache_tree_rec /home/phil/src/git/cache-tree.c:763
    #4 0x557cbe82b837 in prime_cache_tree_rec /home/phil/src/git/cache-tree.c:764
    #5 0x557cbe82b837 in prime_cache_tree_rec /home/phil/src/git/cache-tree.c:764
    #6 0x557cbe8304e1 in prime_cache_tree /home/phil/src/git/cache-tree.c:779
    #7 0x557cbeab7fa7 in reset_head /home/phil/src/git/reset.c:85
    #8 0x557cbe72147f in cmd_rebase builtin/rebase.c:2074
    #9 0x557cbe5bd151 in run_builtin /home/phil/src/git/git.c:461
    #10 0x557cbe5bd151 in handle_builtin /home/phil/src/git/git.c:714
    #11 0x557cbe5c0503 in run_argv /home/phil/src/git/git.c:781
    #12 0x557cbe5c0503 in cmd_main /home/phil/src/git/git.c:912
    #13 0x557cbe5bad28 in main /home/phil/src/git/common-main.c:52
    #14 0x7fdd4b82eb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-07 14:20:01 -07:00
e861b09636 test-read-midx: fix leak of bitmap_index struct
In read_midx_preferred_pack(), we open the bitmap index but never free
it. This isn't a big deal since this is just a test helper, and we exit
immediately after, but since we're trying to keep our leak-checking tidy
now, it's worth fixing.

Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-07 11:01:22 -07:00
106298f7f9 The eleventh batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-06 13:40:32 -07:00
e10bfe7b33 Merge branch 'ab/retire-decl-of-missing-unused-funcs'
Remove external declaration of functions that no longer exist.

* ab/retire-decl-of-missing-unused-funcs:
  config.h: remove unused git_config_get_untracked_cache() declaration
  log-tree.h: remove unused function declarations
  grep.h: remove unused grep_threads_ok() declaration
  builtin.h: remove cmd_tar_tree() declaration
2021-10-06 13:40:14 -07:00
16119bac40 Merge branch 'lh/systemd-timers'
Testfix.

* lh/systemd-timers:
  maintenance: fix test t7900-maintenance.sh
2021-10-06 13:40:13 -07:00
7ca97f5222 Merge branch 'ab/retire-string-list-init'
Code cleanup.

* ab/retire-string-list-init:
  string-list.[ch]: remove string_list_init() compatibility function
2021-10-06 13:40:13 -07:00
870de59af0 Merge branch 'ab/retire-refs-unused-funcs'
Code cleanup.

* ab/retire-refs-unused-funcs:
  refs/ref-cache.[ch]: remove "incomplete" from create_dir_entry()
  refs/ref-cache.c: remove "mkdir" parameter from find_containing_dir()
  refs/ref-cache.[ch]: remove unused add_ref_entry()
  refs/ref-cache.[ch]: remove unused remove_entry_from_dir()
  refs.[ch]: remove unused ref_storage_backend_exists()
2021-10-06 13:40:13 -07:00
d3347c4b42 Merge branch 'os/status-docfix'
Docfix.

* os/status-docfix:
  doc: fix capitalization in "git status --porcelain=v2" description
2021-10-06 13:40:13 -07:00
6926f2e135 Merge branch 'ws/refer-to-forkpoint-config-in-rebase-doc'
Doc update.

* ws/refer-to-forkpoint-config-in-rebase-doc:
  Document `rebase.forkpoint` in rebase man page
2021-10-06 13:40:12 -07:00
4513972086 Merge branch 'gc/doc-first-contribution-reroll'
Doc update.

* gc/doc-first-contribution-reroll:
  MyFirstContribution: Document --range-diff option when writing v2
2021-10-06 13:40:12 -07:00
5a5ea9763c Merge branch 'pw/rebase-reread-todo-after-editing'
The code to re-read the edited todo list in "git rebase -i" was
made more robust.

* pw/rebase-reread-todo-after-editing:
  rebase: fix todo-list rereading
  sequencer.c: factor out a function
2021-10-06 13:40:12 -07:00
b39b0e1a82 Merge branch 'ew/midx-doc-update'
Doc tweak.

* ew/midx-doc-update:
  doc/technical: update note about core.multiPackIndex
2021-10-06 13:40:12 -07:00
d8d33378ed Merge branch 'ab/repo-settings-cleanup'
Code cleanup.

* ab/repo-settings-cleanup:
  repository.h: don't use a mix of int and bitfields
  repo-settings.c: simplify the setup
  read-cache & fetch-negotiator: check "enum" values in switch()
  environment.c: remove test-specific "ignore_untracked..." variable
  wrapper.c: add x{un,}setenv(), and use xsetenv() in environment.c
2021-10-06 13:40:11 -07:00
ed45be7634 Merge branch 'jk/grep-haystack-is-read-only'
Code clean-up in the "grep" machinery.

* jk/grep-haystack-is-read-only:
  grep: store grep_source buffer as const
  grep: mark "haystack" buffers as const
  grep: stop modifying buffer in grep_source_1()
  grep: stop modifying buffer in show_line()
  grep: stop modifying buffer in strip_timestamp
2021-10-06 13:40:11 -07:00
844cc43377 Merge branch 'tb/commit-graph-usage-fix'
Regression in "git commit-graph" command line parsing has been
corrected.

* tb/commit-graph-usage-fix:
  builtin/multi-pack-index.c: disable top-level --[no-]progress
  builtin/commit-graph.c: don't accept common --[no-]progress
2021-10-06 13:40:11 -07:00
7cebe73dbd Merge branch 'pw/rebase-of-a-tag-fix'
"git rebase <upstream> <tag>" failed when aborted in the middle, as
it mistakenly tried to write the tag object instead of peeling it
to HEAD.

* pw/rebase-of-a-tag-fix:
  rebase: dereference tags
  rebase: use lookup_commit_reference_by_name()
  rebase: use our standard error return value
  t3407: rework rebase --quit tests
  t3407: strengthen rebase --abort tests
  t3407: use test_path_is_missing
  t3407: rename a variable
  t3407: use test_cmp_rev
  t3407: use test_commit
  t3407: run tests in $TEST_DIRECTORY
2021-10-06 13:40:11 -07:00
921c795c25 Merge branch 'jt/add-submodule-odb-clean-up'
More code paths that use the hack to add submodule's object
database to the set of alternate object store have been cleaned up.

* jt/add-submodule-odb-clean-up:
  revision: remove "submodule" from opt struct
  repository: support unabsorbed in repo_submodule_init
  submodule: remove unnecessary unabsorbed fallback
2021-10-06 13:40:11 -07:00
3d411afabc editor: save and reset terminal after calling EDITOR
When EDITOR is invoked to modify a commit message, it will likely
change the terminal settings, and if it misbehaves will leave the
terminal output damaged as shown in a recent report from Windows
Terminal[1]

Instead use the functions provided by compat/terminal to save the
settings and recover safely.

[1] https://github.com/microsoft/terminal/issues/9359

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-06 08:53:03 -07:00
e22b245ea5 terminal: teach git how to save/restore its terminal settings
Currently, git will share its console with all its children (unless
they create their own), and is therefore possible that any of them
that might change the settings for it could affect its operations once
completed.

Refactor the platform specific functionality to save the terminal
settings and expand it to also do so for the output handler.

This will allow for the state of the terminal to be saved and
restored around a child that might misbehave (ex vi) which will
be implemented next.

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-06 08:53:00 -07:00
b9e4d84878 t/perf/perf-lib.sh: remove test_times.* at the end test_perf_()
Teach test_perf_() to remove the temporary test_times.* files
at the end of each test.

test_perf_() runs a particular GIT_PERF_REPEAT_COUNT times and creates
./test_times.[123...].  It then uses a perl script to find the minimum
over "./test_times.*" (note the wildcard) and writes that time to
"test-results/<testname>.<testnumber>.result".

If the repeat count is changed during the pXXXX test script, stale
test_times.* files (from previous steps) may be included in the min()
computation.  For example:

...
GIT_PERF_REPEAT_COUNT=3 \
test_perf "status" "
	git status
"

GIT_PERF_REPEAT_COUNT=1 \
test_perf "checkout other" "
	git checkout other
"
...

The time reported in the summary for "XXXX.2 checkout other" would
be "min( checkout[1], status[2], status[3] )".

We prevent that error by removing the test_times.* files at the end of
each test.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-04 22:01:08 -07:00
76f3b69896 t/perf/aggregate.perl: tolerate leading spaces
When using `test_size` with `wc -c`, users on certain platforms can run
into issues when `wc` emits leading space characters in its output,
which confuses get_times.

Callers could switch to use test_file_size instead of `wc -c` (the
former never prints leading space characters, so will always work with
test_size regardless of platform), but this is an easy enough spot to
miss that we should teach get_times to be more tolerant of the input it
accepts.

Teach get_times to do just that by stripping any leading space
characters.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-04 14:12:28 -07:00
d2a534c515 Documentation/git-status: mention how to detect copies
The man page documents that git-status can find copies, but does not
mention how. Whereas git-diff has command line options -C, there is
no such option for git-status - it will only detect copies when the
"status.renames" config option is "copies" or "copy". Document that
in git-status.txt because this has confused me and others[1].

[1]: https://www.reddit.com/r/git/comments/ppc2l9/how_to_get_a_file_with_copied_status/

Signed-off-by: Johannes Altmanninger <aclopte@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-04 13:07:18 -07:00
56c4d7f6a9 Documentation/git-status: document porcelain status T (typechange)
As reported in [1], T is missing from the description of porcelain
status letters in git-status(1) (whereas T *is* documented in
git-diff-files(1) and friends). Document T right after M (modified)
because the two are very similar.

[1] https://github.com/fish-shell/fish-shell/issues/8311

Signed-off-by: Johannes Altmanninger <aclopte@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-04 13:07:18 -07:00
55e7f52b40 Documentation/diff-format: state in which cases porcelain status is T
Porcelain status letter T is documented as "type of the file", which
is technically correct but not enough information for users that are
not so familiar with this term from systems programming. In particular,
given that the only supported file types are "regular file", "symbolic
link" and "submodule", the term "file type" is surely opaque to the
many(?) users who are not aware that symbolic links can be tracked -
I thought that a "chmod +x" could cause the T status (wrong, it's M).

Explicitly document the three file types so users know if/how they
want to handle this.

Signed-off-by: Johannes Altmanninger <aclopte@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-04 13:07:18 -07:00
1566cdd4ae Documentation/git-status: remove impossible porcelain status DR and DC
Commit 176ea74793 ("wt-status.c: handle worktree renames", 2017-12-27)
made a porcelain status like .R or .C possible. They occur only when
the source file is added to the index and the destination file is
added with --intent-to-add.

They also documented DR, but that status is impossible.  The index
change D means that the source file does not exist in the index.
The worktree change R/C states that the file has been renamed/copied
since the index, but that's impossible if it did not exist there.

Reported-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Johannes Altmanninger <aclopte@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-04 13:07:18 -07:00
100c2da2d3 p3400: stop using tac(1)
b3dfeebb92 (rebase: avoid computing unnecessary patch IDs, 2016-07-29)
added a perf test that calls tac(1) from GNU core utilities.  Support
systems without it by reversing the generated list using sort -nr
instead.  sort(1) with options -n and -r is already used in other tests.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-03 22:07:21 -07:00
0785eb7698 The tenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-03 21:49:43 -07:00
3a757d0369 Merge branch 'ah/connect-parse-feature-v0-fix'
Protocol v0 clients can get stuck parsing a malformed feature line.

* ah/connect-parse-feature-v0-fix:
  connect: also update offset for features without values
2021-10-03 21:49:21 -07:00
93cccedb8f Merge branch 'ab/make-compdb-fix'
Build update.

* ab/make-compdb-fix:
  Makefile: pass -Wno-pendantic under GENERATE_COMPILATION_DATABASE=yes
2021-10-03 21:49:21 -07:00
09bde81f29 Merge branch 'bs/difftool-msg-tweak'
Message tweak.

* bs/difftool-msg-tweak:
  difftool: fix word spacing in the usage strings
2021-10-03 21:49:21 -07:00
068966d2e8 Merge branch 'rs/close-pack-leakfix'
Leakfix.

* rs/close-pack-leakfix:
  packfile: release bad_objects in close_pack()
2021-10-03 21:49:20 -07:00
0a2b53c2f2 Merge branch 'ab/bundle-remove-verbose-option'
Doc update.

* ab/bundle-remove-verbose-option:
  bundle: remove ignored & undocumented "--verbose" flag
2021-10-03 21:49:20 -07:00
65351024eb Merge branch 'ab/auto-depend-with-pedantic'
Improve build procedure for developers.

* ab/auto-depend-with-pedantic:
  Makefile: make COMPUTE_HEADER_DEPENDENCIES=auto work with DEVOPTS=pedantic
2021-10-03 21:49:20 -07:00
2498121cd6 Merge branch 'ab/make-clean-depend-dirs'
"make clean" has been updated to remove leftover .depend/
directories, even when it is not told to use them to compute header
dependencies.

* ab/make-clean-depend-dirs:
  Makefile: clean .depend dirs under COMPUTE_HEADER_DEPENDENCIES != yes
2021-10-03 21:49:19 -07:00
cbb1ae05d5 Merge branch 'ds/perf-test-built-path-fix'
Perf test fix.

* ds/perf-test-built-path-fix:
  t/perf/run: fix bin-wrappers computation
2021-10-03 21:49:19 -07:00
58e2bc452b Merge branch 'jk/http-redact-fix'
Sensitive data in the HTTP trace were supposed to be redacted, but
we failed to do so in HTTP/2 requests.

* jk/http-redact-fix:
  http: match headers case-insensitively when redacting
2021-10-03 21:49:19 -07:00
976d3f00d6 Merge branch 'bs/ls-files-opt-help-text-update'
Help text for "ls-files" options have been updated.

* bs/ls-files-opt-help-text-update:
  ls-files: use imperative mood for -X and -z option description
2021-10-03 21:49:19 -07:00
6a4f5dadd3 Merge branch 'da/difftool-dir-diff-symlink-fix'
"git difftool --dir-diff" mishandled symbolic links.

* da/difftool-dir-diff-symlink-fix:
  difftool: fix symlink-file writing in dir-diff mode
2021-10-03 21:49:19 -07:00
92382d14cd Merge branch 'hn/refs-errno-cleanup'
Futz with the way 'errno' is relied on in the refs API to carry the
failure modes up the call chain.

* hn/refs-errno-cleanup:
  refs: make errno output explicit for read_raw_ref_fn
  refs/files-backend: stop setting errno from lock_ref_oid_basic
  refs: remove EINVAL errno output from specification of read_raw_ref_fn
  refs file backend: move raceproof_create_file() here
2021-10-03 21:49:18 -07:00
842d45d293 Merge branch 'ab/refs-files-cleanup'
Continued work on top of the hn/refs-errno-cleanup topic.

* ab/refs-files-cleanup:
  refs/files: remove unused "errno != ENOTDIR" condition
  refs/files: remove unused "errno == EISDIR" code
  refs/files: remove unused "oid" in lock_ref_oid_basic()
  refs API: remove OID argument to reflog_expire()
  reflog expire: don't lock reflogs using previously seen OID
  refs/files: add a comment about refs_reflog_exists() call
  refs: make repo_dwim_log() accept a NULL oid
  refs/debug: re-indent argument list for "prepare"
  refs/files: remove unused "skip" in lock_raw_ref() too
  refs/files: remove unused "extras/skip" in lock_ref_oid_basic()
  refs: drop unused "flags" parameter to lock_ref_oid_basic()
  refs/files: remove unused REF_DELETING in lock_ref_oid_basic()
  refs/packet: add missing BUG() invocations to reflog callbacks
2021-10-03 21:49:18 -07:00
1030daecda Merge branch 'cb/cvsserver'
"git cvsserver" had a long-standing bug in its authentication code,
which has finally been corrected (it is unclear and is a separate
question if anybody is seriously using it, though).

* cb/cvsserver:
  Documentation: cleanup git-cvsserver
  git-cvsserver: protect against NULL in crypt(3)
  git-cvsserver: use crypt correctly to compare password hashes
2021-10-03 21:49:17 -07:00
df9c83bdf7 Merge branch 'jx/ci-l10n'
CI help for l10n.

* jx/ci-l10n:
  ci: new github-action for git-l10n code review
2021-10-03 21:49:17 -07:00
ac162a606b Merge branch 'jk/clone-unborn-head-in-bare'
"git clone" from a repository whose HEAD is unborn into a bare
repository didn't follow the branch name the other side used, which
is corrected.

* jk/clone-unborn-head-in-bare:
  clone: handle unborn branch in bare repos
2021-10-03 21:49:17 -07:00
4a6fd7d3c7 Merge branch 'en/stash-df-fix'
"git stash", where the tentative change involves changing a
directory to a file (or vice versa), was confused, which has been
corrected.

* en/stash-df-fix:
  stash: restore untracked files AFTER restoring tracked files
  stash: avoid feeding directories to update-index
  t3903: document a pair of directory/file bugs
2021-10-03 21:49:16 -07:00
324efc90d1 builtin/repack.c: pass --refs-snapshot when writing bitmaps
To prevent the race described in an earlier patch, generate and pass a
reference snapshot to the multi-pack bitmap code, if we are writing one
from `git repack`.

This patch is mostly limited to creating a temporary file, and then
calling for_each_ref(). Except we try to minimize duplicates, since
doing so can drastically reduce the size in network-of-forks style
repositories. In the kernel's fork network (the repository containing
all objects from the kernel and all its forks), deduplicating the
references drops the snapshot size from 934 MB to just 12 MB.

But since we're handling duplicates in this way, we have to make sure
that we preferred references (those listed in pack.preferBitmapTips)
before non-preferred ones (to avoid recording an object which is pointed
at by a preferred tip as non-preferred).

We accomplish this by doing separate passes over the references: first
visiting each prefix in pack.preferBitmapTips, and then over the rest of
the references.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-01 16:40:09 -07:00
273c9c5777 bisect--helper: add space between colon and following sentence
Add missing space between colon sentence (`bisect-run failed:`) and the
following sentence (`git bisect--helper --bisect-state`).

Fixes: d1bbbe45df (bisect--helper: reimplement `bisect_run` shell
function in C, 2021-09-13)
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-01 15:47:53 -07:00
38c356aad6 blame: describe default output format
While there is descriptions for porcelain and incremental output
formats, the default format isn't described. Describe that format for
the sake of completeness.

Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-01 15:44:32 -07:00
96e41f58fe fsck: report invalid object type-path combinations
Improve the error that's emitted in cases where we find a loose object
we parse, but which isn't at the location we expect it to be.

Before this change we'd prefix the error with a not-a-OID derived from
the path at which the object was found, due to an emergent behavior in
how we'd end up with an "OID" in these codepaths.

Now we'll instead say what object we hashed, and what path it was
found at. Before this patch series e.g.:

    $ git hash-object --stdin -w -t blob </dev/null
    e69de29bb2
    $ mv objects/e6/ objects/e7

Would emit ("[...]" used to abbreviate the OIDs):

    git fsck
    error: hash mismatch for ./objects/e7/9d[...] (expected e79d[...])
    error: e79d[...]: object corrupt or missing: ./objects/e7/9d[...]

Now we'll instead emit:

    error: e69d[...]: hash-path mismatch, found at: ./objects/e7/9d[...]

Furthermore, we'll do the right thing when the object type and its
location are bad. I.e. this case:

    $ git hash-object --stdin -w -t garbage --literally </dev/null
    8315a83d2acc4c174aed59430f9a9c4ed926440f
    $ mv objects/83 objects/84

As noted in an earlier commits we'd simply die early in those cases,
until preceding commits fixed the hard die on invalid object type:

    $ git fsck
    fatal: invalid object type

Now we'll instead emit sensible error messages:

    $ git fsck
    error: 8315[...]: hash-path mismatch, found at: ./objects/84/15[...]
    error: 8315[...]: object is of unknown type 'garbage': ./objects/84/15[...]

In both fsck.c and object-file.c we're using null_oid as a sentinel
value for checking whether we got far enough to be certain that the
issue was indeed this OID mismatch.

We need to add the "object corrupt or missing" special-case to deal
with cases where read_loose_object() will return an error before
completing check_object_signature(), e.g. if we have an error in
unpack_loose_rest() because we find garbage after the valid gzip
content:

    $ git hash-object --stdin -w -t blob </dev/null
    e69de29bb2
    $ chmod 755 objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391
    $ echo garbage >>objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391
    $ git fsck
    error: garbage at end of loose object 'e69d[...]'
    error: unable to unpack contents of ./objects/e6/9d[...]
    error: e69d[...]: object corrupt or missing: ./objects/e6/9d[...]

There is currently some weird messaging in the edge case when the two
are combined, i.e. because we're not explicitly passing along an error
state about this specific scenario from check_stream_oid() via
read_loose_object() we'll end up printing the null OID if an object is
of an unknown type *and* it can't be unpacked by zlib, e.g.:

    $ git hash-object --stdin -w -t garbage --literally </dev/null
    8315a83d2acc4c174aed59430f9a9c4ed926440f
    $ chmod 755 objects/83/15a83d2acc4c174aed59430f9a9c4ed926440f
    $ echo garbage >>objects/83/15a83d2acc4c174aed59430f9a9c4ed926440f
    $ /usr/bin/git fsck
    fatal: invalid object type
    $ ~/g/git/git fsck
    error: garbage at end of loose object '8315a83d2acc4c174aed59430f9a9c4ed926440f'
    error: unable to unpack contents of ./objects/83/15a83d2acc4c174aed59430f9a9c4ed926440f
    error: 8315a83d2acc4c174aed59430f9a9c4ed926440f: object corrupt or missing: ./objects/83/15a83d2acc4c174aed59430f9a9c4ed926440f
    error: 0000000000000000000000000000000000000000: object is of unknown type 'garbage': ./objects/83/15a83d2acc4c174aed59430f9a9c4ed926440f
    [...]

I think it's OK to leave that for future improvements, which would
involve enum-ifying more error state as we've done with "enum
unpack_loose_header_result" in preceding commits. In these
increasingly more obscure cases the worst that can happen is that
we'll get slightly nonsensical or inapplicable error messages.

There's other such potential edge cases, all of which might produce
some confusing messaging, but still be handled correctly as far as
passing along errors goes. E.g. if check_object_signature() returns
and oideq(real_oid, null_oid()) is true, which could happen if it
returns -1 due to the read_istream() call having failed.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-01 15:06:01 -07:00
31deb28f5e fsck: don't hard die on invalid object types
Change the error fsck emits on invalid object types, such as:

    $ git hash-object --stdin -w -t garbage --literally </dev/null
    <OID>

From the very ungraceful error of:

    $ git fsck
    fatal: invalid object type
    $

To:

    $ git fsck
    error: <OID>: object is of unknown type 'garbage': <OID_PATH>
    [ other fsck output ]

We'll still exit with non-zero, but now we'll finish the rest of the
traversal. The tests that's being added here asserts that we'll still
complain about other fsck issues (e.g. an unrelated dangling blob).

To do this we need to pass down the "OBJECT_INFO_ALLOW_UNKNOWN_TYPE"
flag from read_loose_object() through to parse_loose_header(). Since
the read_loose_object() function is only used in builtin/fsck.c we can
simply change it to accept a "struct object_info" (which contains the
OBJECT_INFO_ALLOW_UNKNOWN_TYPE in its flags). See
f6371f9210 (sha1_file: add read_loose_object() function, 2017-01-13)
for the introduction of read_loose_object().

Since we'll need a "struct strbuf" to hold the "type_name" let's pass
it to the for_each_loose_file_in_objdir() callback to avoid allocating
a new one for each loose object in the iteration. It also makes the
memory management simpler than sticking it in fsck_loose() itself, as
we'll only need to strbuf_reset() it, with no need to do a
strbuf_release() before each "return".

Before this commit we'd never check the "type" if read_loose_object()
failed, but now we do. We therefore need to initialize it to OBJ_NONE
to be able to tell the difference between e.g. its
unpack_loose_header() having failed, and us getting past that and into
parse_loose_header().

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-01 15:06:01 -07:00
dccb32bf01 object-file.c: stop dying in parse_loose_header()
Make parse_loose_header() return error codes and data instead of
invoking die() by itself.

For now we'll move the relevant die() call to loose_object_info() and
read_loose_object() to keep this change smaller. In a subsequent
commit we'll make read_loose_object() return an error code instead of
dying. We should also address the "allow_unknown" case (should be
moved to builtin/cat-file.c), but for now I'll be leaving it.

For making parse_loose_header() not die() change its prototype to
accept a "struct object_info *" instead of the "unsigned long *sizep"
it accepted before. Its callers can now check the populated populated
"oi->typep".

Because of this we don't need to pass in the "unsigned int flags"
which we used for OBJECT_INFO_ALLOW_UNKNOWN_TYPE, we can instead do
that check in loose_object_info().

This also refactors some confusing control flow around the "status"
variable. In some cases we set it to the return value of "error()",
i.e. -1, and later checked if "status < 0" was true.

Since 93cff9a978 (sha1_loose_object_info: return error for corrupted
objects, 2017-04-01) the return value of loose_object_info() (then
named sha1_loose_object_info()) had been a "status" variable that be
any negative value, as we were expecting to return the "enum
object_type".

The only negative type happens to be OBJ_BAD, but the code still
assumed that more might be added. This was then used later in
e.g. c84a1f3ed4 (sha1_file: refactor read_object, 2017-06-21). Now
that parse_loose_header() will return 0 on success instead of the
type (which it'll stick into the "struct object_info") we don't need
to conflate these two cases in its callers.

Since parse_loose_header() doesn't need to return an arbitrary
"status" we only need to treat its "ret < 0" specially, but can
idiomatically overwrite it with our own error() return. This along
with having made unpack_loose_header() return an "enum
unpack_loose_header_result" in an earlier commit means that we can
move the previously nested if/else cases mostly into the "ULHR_OK"
branch of the "switch" statement.

We should be less silent if we reach that "status = -1" branch, which
happens if we've got trailing garbage in loose objects, see
f6371f9210 (sha1_file: add read_loose_object() function, 2017-01-13)
for a better way to handle it. For now let's punt on it, a subsequent
commit will address that edge case.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-01 15:06:00 -07:00
5848fb11ac object-file.c: return ULHR_TOO_LONG on "header too long"
Split up the return code for "header too long" from the generic
negative return value unpack_loose_header() returns, and report via
error() if we exceed MAX_HEADER_LEN.

As a test added earlier in this series in t1006-cat-file.sh shows
we'll correctly emit zlib errors from zlib.c already in this case, so
we have no need to carry those return codes further down the
stack. Let's instead just return ULHR_TOO_LONG saying we ran into the
MAX_HEADER_LEN limit, or other negative values for "unable to unpack
<OID> header".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-01 15:06:00 -07:00
3b6a8db3b0 object-file.c: use "enum" return type for unpack_loose_header()
In a preceding commit we changed and documented unpack_loose_header()
from its previous behavior of returning any negative value or zero, to
only -1 or 0.

Let's add an "enum unpack_loose_header_result" type and use it for
these return values, and have the compiler assert that we're
exhaustively covering all of them.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-01 15:06:00 -07:00
01cab97679 object-file.c: simplify unpack_loose_short_header()
Combine the unpack_loose_short_header(),
unpack_loose_header_to_strbuf() and unpack_loose_header() functions
into one.

The unpack_loose_header_to_strbuf() function was added in
46f034483e (sha1_file: support reading from a loose object of unknown
type, 2015-05-03).

Its code was mostly copy/pasted between it and both of
unpack_loose_header() and unpack_loose_short_header(). We now have a
single unpack_loose_header() function which accepts an optional
"struct strbuf *" instead.

I think the remaining unpack_loose_header() function could be further
simplified, we're carrying some complexity just to be able to emit a
garbage type longer than MAX_HEADER_LEN, we could alternatively just
say "we found a garbage type <first 32 bytes>..." instead. But let's
leave the current behavior in place for now.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-01 15:06:00 -07:00
ddb3474b66 object-file.c: make parse_loose_header_extended() public
Make the parse_loose_header_extended() function public and remove the
parse_loose_header() wrapper. The only direct user of it outside of
object-file.c itself was in streaming.c, that caller can simply pass
the required "struct object-info *" instead.

This change is being done in preparation for teaching
read_loose_object() to accept a flag to pass to
parse_loose_header(). It isn't strictly necessary for that change, we
could simply use parse_loose_header_extended() there, but will leave
the API in a better end state.

It would be a better end-state to have already moved the declaration
of these functions to object-store.h to avoid the forward declaration
of "struct object_info" in cache.h, but let's leave that cleanup for
some other time.

1. https://lore.kernel.org/git/patch-v6-09.22-5b9278e7bb4-20210907T104559Z-avarab@gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-01 15:06:00 -07:00
bfff2c4833 object-file.c: return -1, not "status" from unpack_loose_header()
Return a -1 when git_inflate() fails instead of whatever Z_* status
we'd get from zlib.c. This makes no difference to any error we report,
but makes it more obvious that we don't care about the specific zlib
error codes here.

See d21f842690 (unpack_sha1_header(): detect malformed object header,
2016-09-25) for the commit that added the "return status" code. As far
as I can tell there was never a real reason (e.g. different reporting)
for carrying down the "status" as opposed to "-1".

At the time that d21f842690 was written there was a corresponding
"ret < Z_OK" check right after the unpack_sha1_header() call (the
"unpack_sha1_header()" function was later rename to our current
"unpack_loose_header()").

However, that check was removed in c84a1f3ed4 (sha1_file: refactor
read_object, 2017-06-21) without changing the corresponding return
code.

So let's do the minor cleanup of also changing this function to return
a -1.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-01 15:06:00 -07:00
74ad250a1c object-file.c: don't set "typep" when returning non-zero
When the loose_object_info() function returns an error stop faking up
the "oi->typep" to OBJ_BAD. Let the return value of the function
itself suffice. This code cleanup simplifies subsequent changes.

That we set this at all is a relic from the past. Before
052fe5eaca (sha1_loose_object_info: make type lookup optional,
2013-07-12) we would always return the type_from_string(type) via the
parse_sha1_header() function, or -1 (i.e. OBJ_BAD) if we couldn't
parse it.

Then in a combination of 46f034483e (sha1_file: support reading from
a loose object of unknown type, 2015-05-03) and
b3ea7dd32d (sha1_loose_object_info: handle errors from
unpack_sha1_rest, 2017-10-05) our API drifted even further towards
conflating the two again.

Having read the code paths involved carefully I think this is OK. We
are just about to return -1, and we have only one caller:
do_oid_object_info_extended(). That function will in turn go on to
return -1 when we return -1 here.

This might be introducing a subtle bug where a caller of
oid_object_info_extended() would inspect its "typep" and expect a
meaningful value if the function returned -1.

Such a problem would not occur for its simpler oid_object_info()
sister function. That one always returns the "enum object_type", which
in the case of -1 would be the OBJ_BAD.

Having read the code for all the callers of these functions I don't
believe any such bug is being introduced here, and in any case we'd
likely already have such a bug for the "sizep" member (although
blindly checking "typep" first would be a more common case).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-01 15:06:00 -07:00
dd45a56246 cat-file tests: test for current --allow-unknown-type behavior
Add more tests for the current --allow-unknown-type behavior. As noted
in [1] I don't think much of this makes sense, but let's test for it
as-is so we can see if the behavior changes in the future.

1. https://lore.kernel.org/git/87r1i4qf4h.fsf@evledraar.gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-01 15:06:00 -07:00
7e7d220d9d cat-file tests: add corrupt loose object test
Fix a blindspot in the tests for "cat-file" (and by proxy, the guts of
object-file.c) by testing that when we can't decode a loose object
with zlib we'll emit an error from zlib.c.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-01 15:05:59 -07:00
59b8283d55 cat-file tests: test for missing/bogus object with -t, -s and -p
When we look up a missing object with cat_one_file() what error we
print out currently depends on whether we'll error out early in
get_oid_with_context(), or if we'll get an error later from
oid_object_info_extended().

The --allow-unknown-type flag then changes whether we pass the
"OBJECT_INFO_ALLOW_UNKNOWN_TYPE" flag to get_oid_with_context() or
not.

The "-p" flag is yet another special-case in printing the same output
on the deadbeef OID as we'd emit on the deadbeef_short OID for the
"-s" and "-t" options, it also doesn't support the
"--allow-unknown-type" flag at all.

Let's test the combination of the two sets of [-t, -s, -p] and
[--{no-}allow-unknown-type] (the --no-allow-unknown-type is implicit
in not supplying it), as well as a [missing,bogus] object pair.

This extends tests added in 3e370f9faf (t1006: add tests for git
cat-file --allow-unknown-type, 2015-05-03).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-01 15:05:59 -07:00
70e4a57762 cat-file tests: move bogus_* variable declarations earlier
Change the short/long bogus bogus object type variables into a form
where the two sets can be used concurrently. This'll be used by
subsequently added tests.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-01 15:05:59 -07:00
a5ed333121 fsck tests: test for garbage appended to a loose object
There wasn't any output tests for this scenario, let's ensure that we
don't regress on it in the changes that come after this.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-01 15:05:59 -07:00
42cd635b21 fsck tests: test current hash/type mismatch behavior
If fsck we move an object around between .git/objects/?? directories
to simulate a hash mismatch "git fsck" will currently hard die() in
object-file.c. This behavior will be fixed in subsequent commits, but
let's test for it as-is for now.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-01 15:05:59 -07:00
f7a0dba7a2 fsck tests: refactor one test to use a sub-repo
Refactor one of the fsck tests to use a throwaway repository. It's a
pervasive pattern in t1450-fsck.sh to spend a lot of effort on the
teardown of a tests so we're not leaving corrupt content for the next
test.

We can instead use the pattern of creating a named sub-repository,
then we don't have to worry about cleaning up after ourselves, nobody
will care what state the broken "hash-mismatch" repository is after
this test runs.

See [1] for related discussion on various "modern" test patterns that
can be used to avoid verbosity and increase reliability.

1. https://lore.kernel.org/git/87y27veeyj.fsf@evledraar.gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-01 15:05:59 -07:00
093fffdfbe fsck tests: add test for fsck-ing an unknown type
Fix a blindspot in the fsck tests by checking what we do when we
encounter an unknown "garbage" type produced with hash-object's
--literally option.

This behavior needs to be improved, which'll be done in subsequent
patches, but for now let's test for the current behavior.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-01 15:05:59 -07:00
59580685be config.h: remove unused git_config_get_untracked_cache() declaration
This function was removed in ad0fb65999 (repo-settings: parse
core.untrackedCache, 2019-08-13), but not its corresponding *.h entry.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-01 14:39:46 -07:00
067e73c8ae log-tree.h: remove unused function declarations
The init_log_tree_opt() and log_tree_opt_parse() functions were
removed in cd2bdc5309 (Common option parsing for "git log --diff" and
friends, 2006-04-14), but not their corresponding *.h declaration.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-01 14:39:46 -07:00
1fd2aa543d grep.h: remove unused grep_threads_ok() declaration
This function was removed in 0579f91dd7 (grep: enable threading with
-p and -W using lazy attribute lookup, 2011-12-12), but not its
corresponding *.h declaration.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-01 14:39:46 -07:00
f787ebd51c builtin.h: remove cmd_tar_tree() declaration
The cmd_tar_tree() function itself was removed in
925ceccf05 (tar-tree: remove deprecated command, 2013-11-10).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-01 14:39:46 -07:00
0000e81811 builtin/remote.c: add and use SHOW_INFO_INIT
In the preceding commit we introduced REF_STATES_INIT, but did not
change the "struct show_info" to have a corresponding
initializer. Let's do that, and make it use "REF_STATES_INIT" and
"STRING_LIST_INIT_DUP", doing that requires changing "list" and
"states" away from being pointers.

The resulting end-state is simpler since we omit the local "info_list"
and "states" variables in show() as well as the memset().

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-01 14:22:51 -07:00
0bc7787ca9 builtin/remote.c: add and use a REF_STATES_INIT
Use a new REF_STATES_INIT designated initializer instead of assigning
to the "strdup_strings" member of the previously memzero()'d version
of this struct.

The pattern of assigning to "strdup_strings" dates back to
211c89682e (Make git-remote a builtin, 2008-02-29) (when it was
"strdup_paths"), i.e. long before we used anything like our current
established *_INIT patterns consistently.

Then in e61e0cc6b7 (builtin-remote: teach show to display remote
HEAD, 2009-02-25) and e5dcbfd9ab (builtin-remote: new show output
style for push refspecs, 2009-02-25) we added some more of these.

As it turns out we only initialized this struct three times, all the
other uses were of pointers to those initialized structs. So let's
initialize it in those three places, skip the memset(), and pass those
structs down appropriately.

This would be a behavior change if we had codepaths that relied say on
implicitly having had "new_refs" initialized to STRING_LIST_INIT_NODUP
with the memset(), but only set the "strdup_strings" on some other
struct, but then called string_list_append() on "new_refs". There
isn't any such codepath, all of the late assignments to
"strdup_strings" assigned to those structs that we'd use for those
codepaths.

So just initializing them all up-front makes for easier to understand
code, i.e. in the pre-image it looked as though we had that tricky
edge case, but we didn't.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-01 14:22:51 -07:00
73ee449bbf urlmatch.[ch]: add and use URLMATCH_CONFIG_INIT
Change the initialization pattern of "struct urlmatch_config" to use
an *_INIT macro and designated initializers. Right now there's no
other "struct" member of "struct urlmatch_config" which would require
its own *_INIT, but it's good practice not to assume that. Let's also
change this to a designated initializer while we're at it.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-01 14:22:51 -07:00
afc72b5d3a mergesort: use ranks stack
The bottom-up mergesort implementation needs to skip through sublists a
lot.  A recursive version could avoid that, but would require log2(n)
stack frames.  Explicitly manage a stack of sorted sublists of various
lengths instead to avoid fast-forwarding while also keeping a lid on
memory usage.

While this patch was developed independently, a ranks stack is also used
in https://github.com/mono/mono/blob/master/mono/eglib/sort.frag.h by
the Mono project.

The idea is to keep slots for log2(n_max) sorted sublists, one for each
power of 2.  Such a construct can accommodate lists of any length up to
n_max.  Since there is a known maximum number of items (effectively
SIZE_MAX), we can preallocate the whole rank stack.

We add items one by one, which is akin to incrementing a binary number.
Make use of that by keeping track of the number of items and check bits
in it instead of checking for NULL in the rank stack when checking if a
sublist of a certain rank exists, in order to avoid memory accesses.

The first item can go into the empty first slot as a sublist of length
2^0.  The second one needs to be merged with the previous sublist and
the result goes into the empty second slot as a sublist of length 2^1.
The third one goes into vacated first slot and so on.  At the end we
merge all the sublists to get the result.

The new version still performs a stable sort by making sure to put items
seen earlier first when the compare function indicates equality.  That's
done by preferring items from sublists with a higher rank.

The new merge function also tries to minimize the number of operations.
Like blame.c::blame_merge(), the function doesn't set the next pointer
if it already points to the right item, and it exits when it reaches the
end of one of the two sublists that it's given.  The old code couldn't
do the latter because it kept all items in a single list.

The number of comparisons stays the same, though.  Here's example output
of "test-tool mergesort test" for the rand distributions with the most
number of comparisons with the ranks stack:

   $ t/helper/test-tool mergesort test | awk '
       NR > 1 && $1 != "rand" {next}
       $7 > max[$3] {max[$3] = $7; line[$3] = $0}
       END {for (n in line) print line[n]}
   '

distribut mode                    n        m get_next set_next  compare verdict
rand      copy                  100       32      669      420      569 OK
rand      dither               1023       64     9997     5396     8974 OK
rand      dither               1024      512    10007     6159     8983 OK
rand      dither               1025      256    10993     5988     9968 OK

Here are the differences to the results without this patch:

distribut mode                    n        m get_next set_next  compare
rand      copy                  100       32     -515     -280        0
rand      dither               1023       64    -6376    -4834        0
rand      dither               1024      512    -6377    -4081        0
rand      dither               1025      256    -7461    -5287        0

The numbers of get_next and set_next calls are reduced significantly.

NB: These winners are different than the ones shown in the patch that
introduced the unriffle mode because the addition of the unriffle_skewed
mode in between changed the consumption of rand() values.

Here are the distributions with the most comparisons overall with the
ranks stack:

   $ t/helper/test-tool mergesort test | awk '
       $7 > max[$3] {max[$3] = $7; line[$3] = $0}
       END {for (n in line) print line[n]}
   '

distribut mode                    n        m get_next set_next  compare verdict
sawtooth  unriffle_skewed       100      128      689      632      589 OK
sawtooth  unriffle_skewed      1023     1024    10230    10220     9207 OK
sawtooth  unriffle             1024     1024    10241    10240     9217 OK
sawtooth  unriffle_skewed      1025     2048    11266    10242    10241 OK

And here the differences to before:

distribut mode                    n        m get_next set_next  compare
sawtooth  unriffle_skewed       100      128     -495      -68        0
sawtooth  unriffle_skewed      1023     1024    -6143      -10        0
sawtooth  unriffle             1024     1024    -6143        0        0
sawtooth  unriffle_skewed      1025     2048    -7188    -1033        0

We get a similar reduction of get_next calls here, but only a slight
reduction of set_next calls, if at all.

And here are the results of p0071-sort.sh before:

0071.12: llist_mergesort() unsorted    0.36(0.33+0.01)
0071.14: llist_mergesort() sorted      0.15(0.13+0.01)
0071.16: llist_mergesort() reversed    0.16(0.14+0.01)

... and here the ones with this patch:

0071.12: llist_mergesort() unsorted    0.24(0.22+0.01)
0071.14: llist_mergesort() sorted      0.12(0.10+0.01)
0071.16: llist_mergesort() reversed    0.12(0.10+0.01)

NB: We can't use t/perf/run to compare revisions in one run because it
uses the test-tool from the worktree, not from the revisions being
tested.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-01 12:43:09 -07:00
40bc872adb p0071: test performance of llist_mergesort()
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-01 12:43:09 -07:00
84edc40676 p0071: measure sorting of already sorted and reversed files
Check if sorting takes advantage of already sorted or reversed content,
or if that corner case actually decreases performance, like it would for
a simplistic quicksort implementation.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-01 12:43:09 -07:00
f1ed4ce9e3 test-mergesort: add unriffle_skewed mode
Add a mode that turns a sorted list into adversarial input for a
bottom-up mergesort implementation that doubles the length of sorted
sublists at each level -- like our llist_mergesort().

While unriffle mode splits the list in half at each recursion step,
unriffle_skewed splits it into 2^l items and the rest, with 2^l being
the highest power of two smaller than the number of items and thus
2^l >= rest.  The rest is unriffled with the tail of the first half to
require a merge to compare the maximum number of elements.

It complements the unriffle mode, which targets balanced merges.  If
the number of elements is a power of two then both actually produce the
same result, as 2^l == rest == n/2 at each recursion step in that case.

Here are the results:

   $ t/helper/test-tool mergesort test | awk '
      $7 > max[$3] {max[$3] = $7; line[$3] = $0}
      END {for (n in line) print line[n]}
   '

distribut mode                    n        m get_next set_next  compare verdict
sawtooth  unriffle_skewed       100      128     1184      700      589 OK
sawtooth  unriffle_skewed      1023     1024    16373    10230     9207 OK
sawtooth  unriffle             1024     1024    16384    10240     9217 OK
sawtooth  unriffle_skewed      1025     2048    18454    11275    10241 OK

The sawtooth distribution with m>=n produces a sorted list and
unriffle_skewed mode turns it into adversarial input for unbalanced
merges, which it wins in all cases except for n=1024 -- the resulting
list is the same, but unriffle is tested before unriffle_skewed, so its
result is selected by the AWK script.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-01 12:43:09 -07:00
1aa589922b test-mergesort: add unriffle mode
Add a mode that turns sorted items into adversarial input for mergesort.
Do that by running mergesort in reverse and rearranging the items in
such a way that each merge needs the maximum number of operations to
undo it.

To riffle is a card shuffling technique and involves splitting a deck
into two and then to interleave them.  A perfect riffle takes one card
from each half in turn.  That's similar to the most expensive merge,
which has to take one item from each sublist in turn, which requires the
maximum number of comparisons (n-1).

So unriffle does that in reverse, i.e. it generates the first sublist
out of the items at even indexes and the second sublist out of the items
at odd indexes, without changing their order in any other way.  Done
recursively until we reach the trivial sublist length of one, this
twists the list into an order that requires the maximum effort for
mergesort to untangle.

As a baseline, here are the rand distributions with the highest number
of comparisons from "test-tool mergesort test":

   $ t/helper/test-tool mergesort test | awk '
      NR > 1 && $1 != "rand" {next}
      $7 > max[$3] {max[$3] = $7; line[$3] = $0}
      END {for (n in line) print line[n]}
   '

distribut mode                    n        m get_next set_next  compare verdict
rand      copy                  100       32     1184      700      569 OK
rand      reverse_1st_half     1023      256    16373    10230     8976 OK
rand      reverse_1st_half     1024      512    16384    10240     8993 OK
rand      dither               1025       64    18454    11275     9970 OK

And here are the most expensive ones overall:

   $ t/helper/test-tool mergesort test | awk '
      $7 > max[$3] {max[$3] = $7; line[$3] = $0}
      END {for (n in line) print line[n]}
   '

distribut mode                    n        m get_next set_next  compare verdict
stagger   reverse               100       64     1184      700      580 OK
sawtooth  unriffle             1023     1024    16373    10230     9179 OK
sawtooth  unriffle             1024     1024    16384    10240     9217 OK
stagger   unriffle             1025     2048    18454    11275    10241 OK

The sawtooth distribution with m>=n generates a sorted list.  The
unriffle mode is designed to turn that into adversarial input for
mergesort, and that checks out for n=1023 and n=1024, where it produces
the list that requires the most comparisons.

Item counts that are not powers of two have other winners, and that's
because unriffle recursively splits lists into equal-sized halves, while
llist_mergesort() splits them into the biggest power of two smaller than
n and the rest, e.g. for n=1025 it sorts the first 1024 separately and
finally merges them to the last item.

So unriffle mode works as designed for the intended use case, but to
consistently generate adversarial input for unbalanced merges we need
something else.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-01 12:43:08 -07:00
0cecb75531 test-mergesort: add generate subcommand
Add a subcommand for printing test data.  It can be used to generate
special test cases and feed them into the sort subcommand or sort(1) for
performance measurements.  It may also be useful to illustrate the
effect of distributions, modes and their parameters.

It generates n integers with the specified distribution and its
distribution-specific parameter m.  E.g. m is the maximum value for
the plateau distribution and the length and height of individual teeth
of the sawtooth distribution.

The generated values are printed as zero-padded eight-digit hexadecimal
numbers to make sure alphabetic and numeric order are the same.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-01 12:43:08 -07:00
e031e9719d test-mergesort: add test subcommand
Adapt the qsort certification program from "Engineering a Sort Function"
by Bentley and McIlroy for testing our linked list sort function.  It
generates several lists with various distribution patterns and counts
the number of operations llist_mergesort() needs to order them.  It
compares the result to the output of a trusted sort function (qsort(1))
and also checks if the sort is stable.

Also add a test script that makes use of the new subcommand.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-01 12:43:08 -07:00
d536a71169 test-mergesort: add sort subcommand
Give the code for sorting a text file its own sub-command.  This allows
extending the helper, which we'll do in the following patches.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-01 12:43:08 -07:00
2e6701017e test-mergesort: use strbuf_getline()
Strip line ending characters to make sure empty lines are sorted like
sort(1) does.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-01 12:43:08 -07:00
28c10ecbfc difftool: add a missing space to the run_dir_diff() comments
Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-30 18:48:51 -07:00
8e2af8f0db difftool: remove an unnecessary call to strbuf_release()
The `buf` strbuf is reused again later in the same function, so there
is no benefit to calling strbuf_release(). The subsequent usage is
already using strbuf_reset() to reset the buffer, so releasing it
early is only going to lead to a wasteful reallocation.

Remove the early call to strbuf_release(). The same strbuf is already
cleaned up in the "finish:" section so nothing is leaked, either.

Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-30 18:48:51 -07:00
2255c80c91 difftool: refactor dir-diff to write files using helper functions
Add a helpers function to handle the unlinking and writing
of the dir-diff submodule and symlink stand-in files.

Use the helpers to implement the guts of the hashmap loops.
This eliminate duplicate code and safeguards the submodules
hashmap loop against the symlink-chasing behavior that 5bafb3576a
(difftool: fix symlink-file writing in dir-diff mode, 2021-09-22)
addressed.

The submodules loop should not strictly require the unlink() call that
this is introducing to them, but it does not necessarily hurt them
either beyond the cost of the extra unlink().

Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-30 18:48:51 -07:00
4ac9f15492 difftool: create a tmpdir path without repeated slashes
The paths generated by difftool are passed to user-facing diff tools.
Using paths with repeated slashes in them is a cosmetic blemish that
is exposed to users and can be avoided.

Use a strbuf to create the buffer used for the dir-diff tmpdir.
Strip trailing slashes from the value read from TMPDIR to avoid
repeated slashes in the generated paths.

Adjust the error handling to avoid leaking strbufs and to avoid
returning -1 to cmd_main().

Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-30 18:48:51 -07:00
3f566c4e69 grep: refactor next_match() and match_one_pattern() for external use
These changes are made in preparation of, the colorization support for the
"git log" subcommands that, rely on regex functionality (i.e. "--author",
"--committer" and "--grep"). These changes are necessary primarily because
match_one_pattern() expects header lines to be prefixed, however, in
pretty, the prefixes are stripped from the lines because the name-email
pairs need to go through additional parsing, before they can be printed and
because next_match() doesn't handle the case of
"ctx == GREP_CONTEXT_HEAD" at all. So, teach next_match() how to handle the
new case and move match_one_pattern()'s core logic to
headerless_match_one_pattern() while preserving match_one_pattern()'s uses
that depend on the additional processing.

Signed-off-by: Hamza Mahfooz <someguy@effective-light.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-29 13:23:11 -07:00
45bde58ef8 grep: demonstrate bug with textconv attributes and submodules
In some circumstances, "git grep --textconv --recurse-submodules"
ignores the textconv attributes from the submodules and erroneously
applies the attributes defined in the superproject on the submodules'
files. The textconv cache is also saved on the superproject, even for
submodule objects.

A fix for these problems will probably require at least three changes:

- Some textconv and attributes functions (as well as their callees) will
  have to be adjusted to work with arbitrary repositories. Note that
  "fill_textconv()", for example, already receives a "struct repository"
  but it writes the textconv cache using "write_loose_object()", which
  implicitly works on "the_repository".

- grep.c functions will have to call textconv/userdiff routines passing
  the "repo" field from "struct grep_source" instead of the one from
  "struct grep_opt". The latter always points to "the_repository" on
  "git grep" executions (see its initialization in builtin/grep.c), but
  the former points to the correct repository that each source (an
  object, file, or buffer) comes from.

- "userdiff_find_by_path()" might need to use a different attributes
  stack for each repository it works on or reset its internal static
  stack when the repository is changed throughout the calls.

For now, let's add some tests to demonstrate these problems, and also
update a NEEDSWORK comment in grep.h that mentions this bug to reference
the added tests.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-29 13:19:38 -07:00
6d08b9d4ca builtin/repack.c: make largest pack preferred
When repacking into a geometric series and writing a multi-pack bitmap,
it is beneficial to have the largest resulting pack be the preferred
object source in the bitmap's MIDX, since selecting the large packs can
lead to fewer broken delta chains and better compression.

Teach 'git repack' to identify this pack and pass it to the MIDX write
machinery in order to mark it as preferred.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-28 21:20:56 -07:00
1d89d88d37 builtin/repack.c: support writing a MIDX while repacking
Teach `git repack` a new `--write-midx` option for callers that wish to
persist a multi-pack index in their repository while repacking.

There are two existing alternatives to this new flag, but they don't
cover our particular use-case. These alternatives are:

  - Call 'git multi-pack-index write' after running 'git repack', or

  - Set 'GIT_TEST_MULTI_PACK_INDEX=1' in your environment when running
    'git repack'.

The former works, but introduces a gap in bitmap coverage between
repacking and writing a new MIDX (since the repack may have deleted a
pack included in the existing MIDX, invalidating it altogether).

Setting the 'GIT_TEST_' environment variable is obviously unsupported.
In fact, even if it were supported officially, it still wouldn't work,
because it generates the MIDX *after* redundant packs have been dropped,
leading to the same issue as above.

Introduce a new option which eliminates this race by teaching `git
repack` to generate the MIDX at the critical point: after the new packs
have been written and moved into place, but before the redundant packs
have been removed.

This option is compatible with `git repack`'s '--bitmap' option (it
changes the interpretation to be: "write a bitmap corresponding to the
MIDX after one has been generated").

There is a little bit of additional noise in the patch below to avoid
repeating ourselves when selecting which packs to delete. Instead of a
single loop as before (where we iterate over 'existing_packs', decide if
a pack is worth deleting, and if so, delete it), we have two loops (the
first where we decide which ones are worth deleting, and the second
where we actually do the deleting). This makes it so we have a single
check we can make consistently when (1) telling the MIDX which packs we
want to exclude, and (2) actually unlinking the redundant packs.

There is also a tiny change to short-circuit the body of
write_midx_included_packs() when no packs remain in the case of an empty
repository. The MIDX code does not handle this, so avoid trying to
generate a MIDX covering zero packs in the first place.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-28 21:20:56 -07:00
5f18e31f46 builtin/repack.c: extract showing progress to a variable
We only ask whether stderr is a tty before calling
'prune_packed_objects()', but the subsequent patch will add another use.

Extract this check into a variable so that both can use it without
having to call 'isatty()' twice.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-28 21:20:56 -07:00
a169166d2b builtin/repack.c: rename variables that deal with non-kept packs
The new variable `existing_kept_packs` (and corresponding parameter
`fname_kept_list`) added by the previous patch make it seem like
`existing_packs` and `fname_list` are each subsets of the other two
respectively.

In reality, each pair is disjoint: one stores the packs without .keep
files, and the other stores the packs with .keep files. Rename each to
more clearly reflect this.

Suggested-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-28 21:20:56 -07:00
90f838bc36 builtin/repack.c: keep track of existing packs unconditionally
In order to be able to write a multi-pack index during repacking, `git
repack` must keep track of which packs it wants to write into the MIDX.
This set is the union of existing packs which will not be deleted,
new pack(s) generated as a result of the repack, and .keep packs.

Prior to this patch, `git repack` populated the list of existing packs
only when repacking all-into-one (i.e., with `-A` or `-a`), but we will
soon need to know this list when repacking when writing a MIDX without
a-i-o.

Populate the list of existing packs unconditionally, and guard removing
packs from that list only when repacking a-i-o.

Additionally, keep track of filenames of kept packs separately, since
this, too, will be used in an upcoming patch.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-28 21:20:56 -07:00
08944d1c22 midx: preliminary support for --refs-snapshot
To figure out which commits we can write a bitmap for, the multi-pack
index/bitmap code does a reachability traversal, marking any commit
which can be found in the MIDX as eligible to receive a bitmap.

This approach will cause a problem when multi-pack bitmaps are able to
be generated from `git repack`, since the reference tips can change
during the repack. Even though we ignore commits that don't exist in
the MIDX (when doing a scan of the ref tips), it's possible that a
commit in the MIDX reaches something that isn't.

This can happen when a multi-pack index contains some pack which refers
to loose objects (e.g., if a pack was pushed after starting the repack
but before generating the MIDX which depends on an object which is
stored as loose in the repository, and by definition isn't included in
the multi-pack index).

By taking a snapshot of the references before we start repacking, we can
close that race window. In the above scenario (where we have a packed
object pointing at a loose one), we'll either (a) take a snapshot of the
references before seeing the packed one, or (b) take it after, at which
point we can guarantee that the loose object will be packed and included
in the MIDX.

This patch does just that. It writes a temporary "reference snapshot",
which is a list of OIDs that are at the ref tips before writing a
multi-pack bitmap. References that are "preferred" (i.e,. are a suffix
of at least one value of the 'pack.preferBitmapTips' configuration) are
marked with a special '+'.

The format is simple: one line per commit at each tip, with an optional
'+' at the beginning (for preferred references, as described above).

When provided, the reference snapshot is used to drive bitmap selection
instead of the MIDX code doing its own traversal. When it isn't
provided, the usual traversal takes place instead.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-28 21:20:56 -07:00
6fb22ca463 builtin/multi-pack-index.c: support --stdin-packs mode
To power a new `--write-midx` mode, `git repack` will want to write a
multi-pack index containing a certain set of packs in the repository.

This new option will be used by `git repack` to write a MIDX which
contains only the packs which will survive after the repack (that is, it
will exclude any packs which are about to be deleted).

This patch effectively exposes the function implemented in the previous
commit via the `git multi-pack-index` builtin. An alternative approach
would have been to call that function from the `git repack` builtin
directly, but this introduces awkward problems around closing and
reopening the object store, so the MIDX will be written out-of-process.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-28 21:20:55 -07:00
56d863e979 midx: expose write_midx_file_only() publicly
Expose a variant of the write_midx_file() function which ignores packs
that aren't included in an explicit "allow" list.

This will be used in an upcoming patch to power a new `--stdin-packs`
mode of `git multi-pack-index write` for callers that only want to
include certain packs in a MIDX (and ignore any packs which may have
happened to enter the repository independently, e.g., from pushes).

Those patches will provide test coverage for this new function.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-28 21:20:55 -07:00
ebd2e4a13a Makefile: restrict -Wpedantic and -Wno-pedantic-ms-format better
6a8cbc41ba (developer: enable pedantic by default, 2021-09-03)
enables pedantic mode in as many compilers as possible to help gather
feedback on future tightening, so lets do so.

-Wpedantic is missing in some really old gcc 4 versions so lets restrict
it to gcc5 and clang4 (it does work in clang3 AFAIK, but it will be
unlikely that a developer will use such an old compiler anyway).

MinGW gcc is the only one which has -Wno-pedantic-ms-format, and while
that is available also in older compilers, the Windows SDK provides gcc10
so lets aim for that.

Note that in order to target the flag to only Windows, additional changes
were needed in config.mak.uname to propagate the OS detection which also
did some minor refactoring, but which is functionaly equivalent.

Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-28 21:15:53 -07:00
3b723f722d parse-options.h: move PARSE_OPT_SHELL_EVAL between enums
Fix a bad landmine of a bug which has been with us ever since
PARSE_OPT_SHELL_EVAL was added in 47e9cd28f8 (parseopt: wrap
rev-parse --parseopt usage for eval consumption, 2010-06-12).

It's an argument to parse_options() and should therefore be in "enum
parse_opt_flags", but it was added to the per-option "enum
parse_opt_option_flags" by mistake.

Therefore as soon as we'd have an enum member in the former that
reached its value of "1 << 8" we'd run into a seemingly bizarre bug
where that new option would turn on the unrelated PARSE_OPT_SHELL_EVAL
in "git rev-parse --parseopt" by proxy.

I manually checked that no other enum members suffered from such
overlap, by setting the values to non-overlapping values, and making
the relevant codepaths BUG() out if the given value was above/below
the expected (excluding flags=0 in the case of "enum
parse_opt_flags").

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-28 16:50:42 -07:00
6ffb990dc4 doc: fix capitalization in "git status --porcelain=v2" description
The summary line had xy, while the description (and other sub-sections)
has XY.

Signed-off-by: Orgad Shaneh <orgads@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-28 16:29:04 -07:00
b6b210c5e1 Merge branch 'jk/ref-paranoia' into jt/no-abuse-alternate-odb-for-submodules
* jk/ref-paranoia: (71 commits)
  refs: drop "broken" flag from for_each_fullref_in()
  ref-filter: drop broken-ref code entirely
  ref-filter: stop setting FILTER_REFS_INCLUDE_BROKEN
  repack, prune: drop GIT_REF_PARANOIA settings
  refs: turn on GIT_REF_PARANOIA by default
  refs: omit dangling symrefs when using GIT_REF_PARANOIA
  refs: add DO_FOR_EACH_OMIT_DANGLING_SYMREFS flag
  refs-internal.h: reorganize DO_FOR_EACH_* flag documentation
  refs-internal.h: move DO_FOR_EACH_* flags next to each other
  t5312: be more assertive about command failure
  t5312: test non-destructive repack
  t5312: create bogus ref as necessary
  t5312: drop "verbose" helper
  t5600: provide detached HEAD for corruption failures
  t5516: don't use HEAD ref for invalid ref-deletion tests
  t7900: clean up some more broken refs
  The eighth batch
  t0000: avoid masking git exit value through pipes
  tree-diff: fix leak when not HAVE_ALLOCA_H
  pack-revindex.h: correct the time complexity descriptions
  ...
2021-09-28 15:15:42 -07:00
750036c8f7 refs/ref-cache.[ch]: remove "incomplete" from create_dir_entry()
Remove the now-unused "incomplete" parameter from create_dir_entry(),
all its callers specify it as "1", so let's drop the "incomplete=0"
case. The last caller to use it was search_for_subdir(), but that code
was removed in the preceding commit.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-28 15:12:04 -07:00
5e4546d599 refs/ref-cache.c: remove "mkdir" parameter from find_containing_dir()
Remove the "mkdir" parameter from the find_containing_dir() function,
the add_ref_entry() function removed in the preceding commit was its
last user.

Since "mkdir" is always "0" we can also remove the parameter from
search_for_subdir(), which in turn means that we can delete most of
that function.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-28 15:12:04 -07:00
6a99fa2e9e refs/ref-cache.[ch]: remove unused add_ref_entry()
This function has not been used since 9dd389f3d8 (packed_ref_store:
get rid of the `ref_cache` entirely, 2017-09-25).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-28 15:12:04 -07:00
34e8a20d76 refs/ref-cache.[ch]: remove unused remove_entry_from_dir()
This function was missed in 9939b33d6a (packed-backend: rip out some
now-unused code, 2017-09-08), and has been orphaned since then. Let's
delete it.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-28 15:12:04 -07:00
98961e42f0 refs.[ch]: remove unused ref_storage_backend_exists()
This function was added in 3dce444f17 (refs: add a backend method
structure, 2016-09-04), but has never been used by anything. The only
caller that might care uses find_ref_storage_backend() directly.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-28 15:12:04 -07:00
73c5f67071 config.c: remove unused git_config_key_is_valid()
The git_config_key_is_valid() function got left behind in a
refactoring in a9bcf6586d (alias: use the early config machinery to
expand aliases, 2017-06-14),

It previously had two users when it was added in 9e9de18f1a (config:
silence warnings for command names with invalid keys, 2015-08-24), and
after 6a1e1bc0a1 (pager: use callbacks instead of configset,
2016-09-12) only one remained.

By removing it we can get rid of the "quiet" branches in this
function, as well as cases where "store_key" is NULL, for which there
are no other users.

Out of the 5 callers of git_config_parse_key() only one needs to pass
a non-NULL "size_t *baselen_", so we could remove the third parameter
from the public interface. I did not find that potential
simplification to be worthwhile.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-28 14:54:15 -07:00
abf897bacd string-list.[ch]: remove string_list_init() compatibility function
Remove this function left over to accommodate in-flight changes, see
770fedaf9f (string-list.[ch]: add a string_list_init_{nodup,dup}(),
2021-07-01) for the recent change to add
"string_list_init_{nodup,dup}()" initializers.

There was only one user of the API left in remote-curl.c. I don't know
why I didn't include this change to remote-curl.c in
bc40dfb10a (string-list.h users: change to use *_{nodup,dup}(),
2021-07-01), perhaps I just missed it.

In any case, let's change that one user to use the new API, as of
writing this there are no in-flight changes that use, so this seems
like a good time to drop this before we get any new users of this
compatibility API.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-28 14:43:38 -07:00
cefe983a32 The ninth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-28 13:06:53 -07:00
45d141a1dd Merge branch 'en/typofixes'
Typofixes.

* en/typofixes:
  merge-ort: fix completely wrong comment
  trace2.h: fix trivial comment typo
2021-09-28 13:06:53 -07:00
3d875f96f1 Merge branch 'cb/unicode-14'
The unicode character width table (used for output alignment) has
been updated.

* cb/unicode-14:
  unicode: update the width tables to Unicode 14
2021-09-28 13:06:53 -07:00
bb1677fc29 Merge branch 'jk/reduce-malloc-in-v2-servers'
Code cleanup to limit memory consumption and tighten protocol
message parsing.

* jk/reduce-malloc-in-v2-servers:
  ls-refs: reject unknown arguments
  serve: reject commands used as capabilities
  serve: reject bogus v2 "command=ls-refs=foo"
  docs/protocol-v2: clarify some ls-refs ref-prefix details
  ls-refs: ignore very long ref-prefix counts
  serve: drop "keys" strvec
  serve: provide "receive" function for session-id capability
  serve: provide "receive" function for object-format capability
  serve: add "receive" method for v2 capabilities table
  serve: return capability "value" from get_capability()
  serve: rename is_command() to parse_command()
2021-09-28 13:06:53 -07:00
6579e788c0 advice: update message to suggest '--sparse'
The previous changes modified the behavior of 'git add', 'git rm', and
'git mv' to not adjust paths outside the sparse-checkout cone, even if
they exist in the working tree and their cache entries lack the
SKIP_WORKTREE bit. The intention is to warn users that they are doing
something potentially dangerous. The '--sparse' option was added to each
command to allow careful users the same ability they had before.

To improve the discoverability of this new functionality, add a message
to advice.updateSparsePath that mentions the existence of the option.

The previous set of changes also modified the purpose of this message to
include possibly a list of paths instead of only a list of pathspecs.
Make the warning message more clear about this new behavior.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-28 10:31:02 -07:00
93d2c16041 mv: refuse to move sparse paths
Since cmd_mv() does not operate on cache entries and instead directly
checks the filesystem, we can only use path_in_sparse_checkout() as a
mechanism for seeing if a path is sparse or not. Be sure to skip
returning a failure if '-k' is specified.

To ensure that the advice around sparse paths is the only reason a move
failed, be sure to check this as the very last thing before inserting
into the src_for_dst list.

The tests cover a variety of cases such as whether the target is tracked
or untracked, and whether the source or destination are in or outside of
the sparse-checkout definition.

Helped-by: Matheus Tavares Bernardino <matheus.bernardino@usp.br>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-28 10:31:02 -07:00
d7c4415e55 rm: skip sparse paths with missing SKIP_WORKTREE
If a path does not match the sparse-checkout cone but is somehow missing
the SKIP_WORKTREE bit, then 'git rm' currently succeeds in removing the
file. One reason a user might be in this situation is a merge conflict
outside of the sparse-checkout cone. Removing such a file might be
problematic for users who are not sure what they are doing.

Add a check to path_in_sparse_checkout() when 'git rm' is checking if a
path should be considered for deletion. Of course, this check is ignored
if the '--sparse' option is specified, allowing users who accept the
risks to continue with the removal.

This also removes a confusing behavior where a user asks for a directory
to be removed, but only the entries that are within the sparse-checkout
definition are removed. Now, 'git rm <dir>' will fail without '--sparse'
and will succeed in removing all contained paths with '--sparse'.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-28 10:31:02 -07:00
f9786f9b85 rm: add --sparse option
As we did previously in 'git add', add a '--sparse' option to 'git rm'
that allows modifying paths outside of the sparse-checkout definition.
The existing checks in 'git rm' are restricted to tracked files that
have the SKIP_WORKTREE bit in the current index. Future changes will
cause 'git rm' to reject removing paths outside of the sparse-checkout
definition, even if they are untracked or do not have the SKIP_WORKTREE
bit.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-28 10:31:02 -07:00
61d450f049 add: update --renormalize to skip sparse paths
We added checks for path_in_sparse_checkout() to portions of 'git add'
that add warnings and prevent stagins a modification, but we skipped the
--renormalize mode. Update renormalize_tracked_files() to ignore cache
entries whose path is outside of the sparse-checkout cone (unless
--sparse is provided). Add a test in t3705.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-28 10:31:02 -07:00
63b60b3add add: update --chmod to skip sparse paths
We added checks for path_in_sparse_checkout() to portions of 'git add'
that add warnings and prevent staging a modification, but we skipped the
--chmod mode. Update chmod_pathspec() to ignore cache entries whose path
is outside of the sparse-checkout cone (unless --sparse is provided).
Add a test in t3705.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-28 10:31:02 -07:00
0299a69694 add: implement the --sparse option
We previously modified 'git add' to refuse updating index entries
outside of the sparse-checkout cone. This is justified to prevent users
from accidentally getting into a confusing state when Git removes those
files from the working tree at some later point.

Unfortunately, this caused some workflows that were previously possible
to become impossible, especially around merge conflicts outside of the
sparse-checkout cone. These were documented in tests within t1092.

We now re-enable these workflows using a new '--sparse' option to 'git
add'. This allows users to signal "Yes, I do know what I'm doing with
these files," and accept the consequences of the files leaving the
worktree later.

We delay updating the advice message until implementing a similar option
in 'git rm' and 'git mv'.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-28 10:31:02 -07:00
49fdd51a23 add: skip tracked paths outside sparse-checkout cone
When 'git add' adds a tracked file that is outside of the
sparse-checkout cone, it checks the SKIP_WORKTREE bit to see if the file
exists outside of the sparse-checkout cone. This is usually correct,
except in the case of a merge conflict outside of the cone.

Modify add_pathspec_matched_against_index() to be more careful about
paths by checking the sparse-checkout patterns in addition to the
SKIP_WORKTREE bit. This causes 'git add' to no longer allow files
outside of the cone that removed the SKIP_WORKTREE bit due to a merge
conflict.

With only this change, users will only be able to add the file after
adding the file to the sparse-checkout cone. A later change will allow
users to force adding even though the file is outside of the
sparse-checkout cone.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-28 10:31:02 -07:00
105e8b014b add: fail when adding an untracked sparse file
The add_files() method in builtin/add.c takes a set of untracked files
that are being added by the input pathspec and inserts them into the
index. If these files are outside of the sparse-checkout cone, then they
gain the SKIP_WORKTREE bit at some point. However, this was not checked
before inserting into the index, so these files are added even though we
want to avoid modifying the index outside of the sparse-checkout cone.

Add a check within add_files() for these files and write the advice
about files outside of the sparse-checkout cone.

This behavior change modifies some existing tests within t1092. These
tests intended to document how a user could interact with the existing
behavior in place. Many of these tests need to be marked as expecting
failure. A future change will allow these tests to pass by adding a flag
to 'git add' that allows users to modify index entries outside of the
sparse-checkout cone.

The 'submodule handling' test is intended to document what happens to
directories that contain a submodule when the sparse index is enabled.
It is not trying to say that users should be able to add submodules
outside of the sparse-checkout cone, so that test can be modified to
avoid that operation.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-28 10:31:02 -07:00
ed4958477b dir: fix pattern matching on dirs
Within match_pathname(), one successful matching category happens when
the pattern is equal to its non-wildcard prefix. At this point, we have
checked that the input 'pathname' matches the pattern up to the prefix
length, and then we subtraced that length from both 'patternlen' and
'namelen'.

In the case of a directory match, this prefix match should be
sufficient. However, the success condition only cared about _exact_
equality here. Instead, we should allow any path that agrees on this
prefix in the case of PATTERN_FLAG_MUSTBEDIR.

This case was not tested before because of the way unpack_trees() would
match a parent directory before visiting the contained paths. This
approach is changing, so we must change this comparison.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-28 10:31:02 -07:00
f6526728f9 dir: select directories correctly
When matching a path against a list of patterns, the ones that require a
directory match previously did not work when a filename is specified.
This was fine when all pattern-matching was done within methods such as
unpack_trees() that check a directory before recursing into the
contained files. However, other commands will start matching individual
files against pattern lists without that recursive approach.

The last_matching_pattern_from_list() logic performs some checks on the
filetype of a path within the index when the PATTERN_FLAG_MUSTBEDIR flag
is set. This works great when setting SKIP_WORKTREE bits within
unpack_trees(), but doesn't work well when passing an arbitrary path
such as a file within a matching directory.

We extract the logic around determining the file type, but attempt to
avoid checking the filesystem if the parent directory already matches
the sparse-checkout patterns. The new path_matches_dir_pattern() method
includes a 'path_parent' parameter that is used to store the parent
directory of 'pathname' between multiple pattern matching tests. This is
loaded lazily, only on the first pattern it finds that has the
PATTERN_FLAG_MUSTBEDIR flag.

If we find that a path has a parent directory, we start by checking to
see if that parent directory matches the pattern. If so, then we do not
need to query the index for the type (which can be expensive). If we
find that the parent does not match, then we still must check the type
from the index for the given pathname.

Note that this does not affect cone mode pattern matching, but instead
the more general -- and slower -- full pattern set. Thus, this does not
affect the sparse index.

Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-28 10:31:02 -07:00
edd2cd345f t1092: behavior for adding sparse files
Add some tests to demonstrate the current behavior around adding files
outside of the sparse-checkout cone. Currently, untracked files are
handled differently from tracked files. A future change will make these
cases be handled the same way.

Further expand checking that a failed 'git add' does not stage changes
to the index.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-28 10:31:02 -07:00
670e597399 maintenance: fix test t7900-maintenance.sh
Commit b681b191 introduced the support of systemd timers for git
maintenance.
A test is leveraging the `systemd-analyze verify` utility to verify the
correctness of the systemd unit files generated by git.

But on some systems, although the `systemd-analyze` tool is installed
and supports the `verify` subcommand, it fails with some permission
errors.

So, instead of only checking if the `verify` subcommand exists, a more
reliable way of detecting whether `systemd-analyze verify` can be used
is to try to use it.

The SYSTEMD_ANALYZE prerequisite is now trying to run `systemd-analyze
verify` on a systemd unit file which is shipped by systemd itself.
We can reasonably think that, on systemd hosts, this file is present and
valid.

Signed-off-by: Lénaïc Huard <lenaic@lhuard.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-27 16:06:59 -07:00
4eb2bfdc92 builtin/blame.c: refactor commit_info_init() to COMMIT_INFO_INIT macro
Remove the commit_info_init() function addded in ea02ffa385 (mailmap:
simplify map_user() interface, 2013-01-05) and instead initialize the
"struct commit_info" with a macro.

This is the more idiomatic pattern in the codebase, and doesn't leave
us wondering when we see the *_init() function if this struct needs
more complex initialization than a macro can provide.

The get_commit_info() function is only called by the three callers
being changed here immediately after initializing the struct with the
macros, so by moving the initialization to the callers we don't need
to do it in get_commit_info() anymore.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-27 15:02:32 -07:00
6e54a32468 daemon.c: refactor hostinfo_init() to HOSTINFO_INIT macro
Remove the hostinfo_init() function added in 01cec54e13 (daemon:
deglobalize hostname information, 2015-03-07) and instead initialize
the "struct hostinfo" with a macro.

This is the more idiomatic pattern in the codebase, and doesn't leave
us wondering when we see the *_init() function if this struct needs
more complex initialization than a macro can provide.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-27 15:02:32 -07:00
a30321b9ea Merge branch 'ab/designated-initializers' into ab/designated-initializers-more
* ab/designated-initializers:
  cbtree.h: define cb_init() in terms of CBTREE_INIT
  *.h: move some *_INIT to designated initializers
  *.h _INIT macros: don't specify fields equal to 0
  *.[ch] *_INIT macros: use { 0 } for a "zero out" idiom
  submodule-config.h: remove unused SUBMODULE_INIT macro
2021-09-27 15:02:13 -07:00
538835d2ac cbtree.h: define cb_init() in terms of CBTREE_INIT
Use the same pattern for cb_init() as the one established in the
recent refactoring of other such patterns in
5726a6b401 (*.c *_init(): define in terms of corresponding *_INIT
macro, 2021-07-01).

It has been pointed out[1] that we could perhaps use this C99
replacement of using a compound literal for all of these:

    *t = (struct cb_tree){ 0 };

But let's just stick to the existing pattern established in
5726a6b401 for now, we can leave another weather balloon for some
other time.

1. http://lore.kernel.org/git/ef724a3a-a4b8-65d3-c928-13a7d78f189a@gmail.com

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-27 14:48:00 -07:00
f69a6e4f07 *.h: move some *_INIT to designated initializers
Move various *_INIT macros to use designated initializers. This helps
readability. I've only picked those leftover macros that were not
touched by another in-flight series of mine which changed others, but
also how initialization was done.

In the case of SUBMODULE_ALTERNATE_SETUP_INIT I've left an explicit
initialization of "error_mode", even though
SUBMODULE_ALTERNATE_ERROR_IGNORE itself is defined as "0". Let's not
peek under the hood and assume that enum fields we know the value of
will stay at "0".

The change to "TESTSUITE_INIT" in "t/helper/test-run-command.c" was
part of an earlier on-list version[1] of c90be786da (test-tool
run-command: fix flip-flop init pattern, 2021-09-11).

1. https://lore.kernel.org/git/patch-1.1-0aa4523ab6e-20210909T130849Z-avarab@gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-27 14:48:00 -07:00
608cfd31cf *.h _INIT macros: don't specify fields equal to 0
Change the initialization of "struct strbuf" changed in
cbc0f81d96 (strbuf: use designated initializers in STRBUF_INIT,
2017-07-10) to omit specifying "alloc" and "len", as we do with other
"alloc" and "len" (or "nr") in similar structs.

Let's likewise omit the explicit initialization of all fields in the
"struct ipc_client_connect_option" struct added in
59c7b88198 (simple-ipc: add win32 implementation, 2021-03-15).

Do the same for a few other initializers, e.g. STRVEC_INIT and
CACHE_DEF_INIT.

Finally, start incrementally changing the same pattern in
"t/helper/test-run-command.c". This change was part of an earlier
on-list version[1] of c90be786da (test-tool run-command: fix
flip-flop init pattern, 2021-09-11).

1. https://lore.kernel.org/git/patch-1.1-0aa4523ab6e-20210909T130849Z-avarab@gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-27 14:47:59 -07:00
9865b6e6a4 *.[ch] *_INIT macros: use { 0 } for a "zero out" idiom
In C it isn't required to specify that all members of a struct are
zero'd out to 0, NULL or '\0', just providing a "{ 0 }" will
accomplish that.

Let's also change code that provided N zero'd fields to just
provide one, and change e.g. "{ NULL }" to "{ 0 }" for
consistency. I.e. even if the first member is a pointer let's use "0"
instead of "NULL". The point of using "0" consistently is to pick one,
and to not have the reader wonder why we're not using the same pattern
everywhere.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-27 14:47:59 -07:00
9d444d9ee0 submodule-config.h: remove unused SUBMODULE_INIT macro
This macro was added and used in c68f837576 (implement fetching of
moved submodules, 2017-10-16) but its last user went away in
be76c21282 (fetch: ensure submodule objects fetched, 2018-12-06).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-27 14:47:59 -07:00
dd20e4a6db Makefile: pass -Wno-pendantic under GENERATE_COMPILATION_DATABASE=yes
The same bug fixed in the "COMPUTE_HEADER_DEPENDENCIES=auto" mode in
the preceding commit was also present with
"GENERATE_COMPILATION_DATABASE=yes". Let's fix it so it works again
with "DEVOPTS=1".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-27 13:54:02 -07:00
0e29222e0c Documentation: call out commands that nuke untracked files/directories
Some commands have traditionally also removed untracked files (or
directories) that were in the way of a tracked file we needed.  Document
these cases.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-27 13:38:37 -07:00
94b7f1563a Comment important codepaths regarding nuking untracked files/dirs
In the last few commits we focused on code in unpack-trees.c that
mistakenly removed untracked files or directories.  There may be more of
those, but in this commit we change our focus: callers of toplevel
commands that are expected to remove untracked files or directories.

As noted previously, we have toplevel commands that are expected to
delete untracked files such as 'read-tree --reset', 'reset --hard', and
'checkout --force'.  However, that does not mean that other highlevel
commands that happen to call these other commands thought about or
conveyed to users the possibility that untracked files could be removed.
Audit the code for such callsites, and add comments near existing
callsites to mention whether these are safe or not.

My auditing is somewhat incomplete, though; it skipped several cases:
  * git-rebase--preserve-merges.sh: is in the process of being
    deprecated/removed, so I won't leave a note that there are
    likely more bugs in that script.
  * contrib/git-new-workdir: why is the -f flag being used in a new
    empty directory??  It shouldn't hurt, but it seems useless.
  * git-p4.py: Don't see why -f is needed for a new dir (maybe it's
    not and is just superfluous), but I'm not at all familiar with
    the p4 stuff
  * git-archimport.perl: Don't care; arch is long since dead
  * git-cvs*.perl: Don't care; cvs is long since dead

Also, the reset --hard in builtin/worktree.c looks safe, due to only
running in an empty directory.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-27 13:38:37 -07:00
56d06fe4aa unpack-trees: avoid nuking untracked dir in way of locally deleted file
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-27 13:38:37 -07:00
1fdd51aa13 unpack-trees: avoid nuking untracked dir in way of unmerged file
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-27 13:38:37 -07:00
480d3d6bf9 Change unpack_trees' 'reset' flag into an enum
Traditionally, unpack_trees_options->reset was used to signal that it
was okay to delete any untracked files in the way.  This was used by
`git read-tree --reset`, but then started appearing in other places as
well.  However, many of the other uses should not be deleting untracked
files in the way.  Change this value to an enum so that a value of 1
(i.e. "true") can be split into two:
   UNPACK_RESET_PROTECT_UNTRACKED,
   UNPACK_RESET_OVERWRITE_UNTRACKED
In order to catch accidental misuses (i.e. where folks call it the way
they traditionally used to), define the special enum value of
   UNPACK_RESET_INVALID = 1
which will trigger a BUG().

Modify existing callers so that
   read-tree --reset
   reset --hard
   checkout --force
continue using the UNPACK_RESET_OVERWRITE_UNTRACKED logic, while other
callers, including
   am
   checkout without --force
   stash  (though currently dead code; reset always had a value of 0)
   numerous callers from rebase/sequencer to reset_head()
will use the new UNPACK_RESET_PROTECT_UNTRACKED value.

Also, note that it has been reported that 'git checkout <treeish>
<pathspec>' currently also allows overwriting untracked files[1].  That
case should also be fixed, but it does not use unpack_trees() and thus
is outside the scope of the current changes.

[1] https://lore.kernel.org/git/15dad590-087e-5a48-9238-5d2826950506@gmail.com/

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-27 13:38:37 -07:00
1b5f37334a Remove ignored files by default when they are in the way
Change several commands to remove ignored files by default when they are
in the way.  Since some commands (checkout, merge) take a
--no-overwrite-ignore option to allow the user to configure this, and it
may make sense to add that option to more commands (and in the case of
merge, actually plumb that configuration option through to more of the
backends than just the fast-forwarding special case), add little
comments about where such flags would be used.

Incidentally, this fixes a test failure in t7112.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-27 13:38:37 -07:00
c42e0b6409 unpack-trees: make dir an internal-only struct
Avoid accidental misuse or confusion over ownership by clearly making
unpack_trees_options.dir an internal-only variable.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-27 13:38:37 -07:00
04988c8d18 unpack-trees: introduce preserve_ignored to unpack_trees_options
Currently, every caller of unpack_trees() that wants to ensure ignored
files are overwritten by default needs to:
   * allocate unpack_trees_options.dir
   * flip the DIR_SHOW_IGNORED flag in unpack_trees_options.dir->flags
   * call setup_standard_excludes
AND then after the call to unpack_trees() needs to
   * call dir_clear()
   * deallocate unpack_trees_options.dir
That's a fair amount of boilerplate, and every caller uses identical
code.  Make this easier by instead introducing a new boolean value where
the default value (0) does what we want so that new callers of
unpack_trees() automatically get the appropriate behavior.  And move all
the handling of unpack_trees_options.dir into unpack_trees() itself.

While preserve_ignored = 0 is the behavior we feel is the appropriate
default, we defer fixing commands to use the appropriate default until a
later commit.  So, this commit introduces several locations where we
manually set preserve_ignored=1.  This makes it clear where code paths
were previously preserving ignored files when they should not have been;
a future commit will flip these to instead use a value of 0 to get the
behavior we want.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-27 13:38:37 -07:00
491a7575f1 read-tree, merge-recursive: overwrite ignored files by default
This fixes a long-standing patchwork of ignored files handling in
read-tree and merge-recursive, called out and suggested by Junio long
ago.  Quoting from commit dcf0c16ef1 ("core.excludesfile clean-up"
2007-11-16):

    git-read-tree takes --exclude-per-directory=<gitignore>,
    not because the flexibility was needed.  Again, this was
    because the option predates the standardization of the ignore
    files.

    ...

    On the other hand, I think it makes perfect sense to fix
    git-read-tree, git-merge-recursive and git-clean to follow the
    same rule as other commands.  I do not think of a valid use case
    to give an exclude-per-directory that is nonstandard to
    read-tree command, outside a "negative" test in the t1004 test
    script.

    This patch is the first step to untangle this mess.

    The next step would be to teach read-tree, merge-recursive and
    clean (in C) to use setup_standard_excludes().

History shows each of these were partially or fully fixed:

  * clean was taught the new trick in 1617adc7a0 ("Teach git clean to
    use setup_standard_excludes()", 2007-11-14).

  * read-tree was primarily used by checkout & merge scripts.  checkout
    and merge later became builtins and were both fixed to use the new
    setup_standard_excludes() handling in fc001b526c ("checkout,merge:
    loosen overwriting untracked file check based on info/exclude",
    2011-11-27).  So the primary users were fixed, though read-tree
    itself was not.

  * merge-recursive has now been replaced as the default merge backend
    by merge-ort.  merge-ort fixed this by using
    setup_standard_excludes() starting early in its implementation; see
    commit 6681ce5cf6 ("merge-ort: add implementation of checkout()",
    2020-12-13), largely due to its design depending on checkout() and
    thus being influenced by the checkout code.  However,
    merge-recursive itself was not fixed here, in part because its
    design meant it had difficulty differentiating between untracked
    files, ignored files, leftover tracked files that haven't been
    removed yet due to order of processing files, and files written by
    itself due to collisions).

Make the conversion more complete by now handling read-tree and
handling at least the unpack_trees() portion of merge-recursive.  While
merge-recursive is on its way out, fixing the unpack_trees() portion is
easy and facilitates some of the later changes in this series.  Note
that fixing read-tree makes the --exclude-per-directory option to
read-tree useless, so we remove it from the documentation (though we
continue to accept it if passed).

The read-tree changes happen to fix a bug in t1013.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-27 13:38:37 -07:00
c512d27e78 checkout, read-tree: fix leak of unpack_trees_options.dir
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-27 13:38:37 -07:00
2d84c4ed57 lazyload.h: use an even more generic function pointer than FARPROC
gcc will helpfully raise a -Wcast-function-type warning when casting
between functions that might have incompatible return types
(ex: GetUserNameExW returns bool which is only half the size of the
return type from FARPROC which is long long), so create a new type that
could be used as a completely generic function pointer and cast through
it instead.

Additionaly remove the -Wno-incompatible-pointer-types temporary
flag added in 27e0c3c (win32: allow building with pedantic mode
enabled, 2021-09-03), as it will be no longer needed.

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-27 13:13:58 -07:00
67985e4e4a refs: drop "broken" flag from for_each_fullref_in()
No callers pass in anything but "0" here. Likewise to our sibling
functions. Note that some of them ferry along the flag, but none of
their callers pass anything but "0" either.

Nor is anybody likely to change that. Callers which really want to see
all of the raw refs use for_each_rawref(). And anybody interested in
iterating a subset of the refs will likely be happy to use the
now-default behavior of showing broken refs, but omitting dangling
symlinks.

So we can get rid of this whole feature.

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-27 12:36:45 -07:00
2d653c5036 ref-filter: drop broken-ref code entirely
Now that none of our callers passes the INCLUDE_BROKEN flag, we can drop
it entirely, along with the code to plumb it through to the
for_each_fullref_in() functions.

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-27 12:36:45 -07:00
1763334caf ref-filter: stop setting FILTER_REFS_INCLUDE_BROKEN
Of the ref-filter callers, for-each-ref and git-branch both set the
INCLUDE_BROKEN flag (but git-tag does not, which is a weird
inconsistency).  But now that GIT_REF_PARANOIA is on by default, that
produces almost the same outcome for all three.

The one exception is that GIT_REF_PARANOIA will omit dangling symrefs.
That's a better behavior for these tools, as they would never include
such a symref in the main output anyway (they can't, as it doesn't point
to an object). Instead they issue a warning to stderr. But that warning
is somewhat useless; a dangling symref is a perfectly reasonable thing
to have in your repository, and is not a sign of corruption. It's much
friendlier to just quietly ignore it.

And in terms of robustness, the warning gains us little. It does not
impact the exit code of either tool. So while the warning _might_ clue
in a user that they have an unexpected broken symref, it would not help
any kind of scripted use.

This patch converts for-each-ref and git-branch to stop using the
INCLUDE_BROKEN flag. That gives them more reasonable behavior, and
harmonizes them with git-tag.

We have to change one test to adapt to the situation. t1430 tries to
trigger all of the REF_ISBROKEN behaviors from the underlying ref code.
It uses for-each-ref to do so (because there isn't any other mechanism).
That will no longer issue a warning about the symref which points to an
invalid name, as it's considered dangling (and we can instead be sure
that it's _not_ mentioned on stderr). Note that we do still complain
about the illegally named "broken..symref"; its problem is not that it's
dangling, but the name of the symref itself is illegal.

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-27 12:36:45 -07:00
5d1f5b8cd4 repack, prune: drop GIT_REF_PARANOIA settings
Now that GIT_REF_PARANOIA is the default, we don't need to selectively
enable it for destructive operations. In fact, it's harmful to do so,
because it overrides any GIT_REF_PARANOIA=0 setting that the user may
have provided (because they're trying to work around some corruption).

With these uses gone, we can further clean up the ref_paranoia global,
and make it a static variable inside the refs code.

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-27 12:36:45 -07:00
968f12fdac refs: turn on GIT_REF_PARANOIA by default
The original point of the GIT_REF_PARANOIA flag was to include broken
refs in iterations, so that possibly-destructive operations would not
silently ignore them (and would generally instead try to operate on the
oids and fail when the objects could not be accessed).

We already turned this on by default for some dangerous operations, like
"repack -ad" (where missing a reachability tip would mean dropping the
associated history). But it was not on for general use, even though it
could easily result in the spreading of corruption (e.g., imagine
cloning a repository which simply omits some of its refs because
their objects are missing; the result quietly succeeds even though you
did not clone everything!).

This patch turns on GIT_REF_PARANOIA by default. So a clone as mentioned
above would actually fail (upload-pack tells us about the broken ref,
and when we ask for the objects, pack-objects fails to deliver them).
This may be inconvenient when working with a corrupted repository, but:

  - we are better off to err on the side of complaining about
    corruption, and then provide mechanisms for explicitly loosening
    safety.

  - this is only one type of corruption anyway. If we are missing any
    other objects in the history that _aren't_ ref tips, then we'd
    behave similarly (happily show the ref, but then barf when we
    started traversing).

We retain the GIT_REF_PARANOIA variable, but simply default it to "1"
instead of "0". That gives the user an escape hatch for loosening this
when working with a corrupt repository. It won't work across a remote
connection to upload-pack (because we can't necessarily set environment
variables on the remote), but there the client has other options (e.g.,
choosing which refs to fetch).

As a bonus, this also makes ref iteration faster in general (because we
don't have to call has_object_file() for each ref), though probably not
noticeably so in the general case. In a repo with a million refs, it
shaved a few hundred milliseconds off of upload-pack's advertisement;
that's noticeable, but most repos are not nearly that large.

The possible downside here is that any operation which iterates refs but
doesn't ever open their objects may now quietly claim to have X when the
object is corrupted (e.g., "git rev-list new-branch --not --all" will
treat a broken ref as uninteresting). But again, that's not really any
different than corruption below the ref level. We might have
refs/heads/old-branch as non-corrupt, but we are not actively checking
that we have the entire reachable history. Or the pointed-to object
could even be corrupted on-disk (but our "do we have it" check would
still succeed). In that sense, this is merely bringing ref-corruption in
line with general object corruption.

One alternative implementation would be to actually check for broken
refs, and then _immediately die_ if we see any. That would cause the
"rev-list --not --all" case above to abort immediately. But in many ways
that's the worst of all worlds:

  - it still spends time looking up the objects an extra time

  - it still doesn't catch corruption below the ref level

  - it's even more inconvenient; with the current implementation of
    GIT_REF_PARANOIA for something like upload-pack, we can make
    the advertisement and let the client choose a non-broken piece of
    history. If we bail as soon as we see a broken ref, they cannot even
    see the advertisement.

The test changes here show some of the fallout. A non-destructive "git
repack -adk" now fails by default (but we can override it). Deleting a
broken ref now actually tells the hooks the correct "before" state,
rather than a confusing null oid.

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-27 12:36:45 -07:00
6d751be4b6 refs: omit dangling symrefs when using GIT_REF_PARANOIA
Dangling symrefs aren't actually a corruption problem. It's perfectly
fine for refs/remotes/origin/HEAD to point to an unborn branch. And in
particular, if you are trying to establish reachability, a symref that
points nowhere doesn't matter either way. Any ref it could point to will
be examined during the rest of the traversal.

It's possible that a symref pointing nowhere _could_ be a sign that the
ref it was meant to point to was deleted accidentally (e.g., via
corruption). But there is no particular reason to think that is true for
any given case, and in the meantime, GIT_REF_PARANOIA kicking in
automatically for some operations means they'll fail unnecessarily.

So let's loosen it just a bit. The new test in t5312 shows off an
example that is safe, but currently fails (and no longer does after this
patch).

Note that we don't do anything if the caller explicitly asked for
DO_FOR_EACH_INCLUDE_BROKEN. In that case they may be looking for
dangling symrefs themselves, and setting GIT_REF_PARANOIA should not
_loosen_ things from what the caller asked for.

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-27 12:36:45 -07:00
8dccb2244c refs: add DO_FOR_EACH_OMIT_DANGLING_SYMREFS flag
When the DO_FOR_EACH_INCLUDE_BROKEN flag is used, we include both actual
corrupt refs (illegal names, missing objects), but also symrefs that
point to nothing. This latter is not really a corruption, but just
something that may happen normally. For example, the symref at
refs/remotes/origin/HEAD may point to a tracking branch which is later
deleted. (The local HEAD may also be unborn, of course, but we do not
access it through ref iteration).

Most callers of for_each_ref() etc, do not care. They don't pass
INCLUDE_BROKEN, so don't see it at all. But for those which do pass it,
this somewhat-normal state causes extra warnings (e.g., from
for-each-ref) or even aborts operations (destructive repacks with
GIT_REF_PARANOIA set).

This patch just introduces the flag and the mechanism; there are no
callers yet (and hence no tests). Two things to note on the
implementation:

  - we actually skip any symref that does not resolve to a ref. This
    includes ones which point to an invalidly-named ref. You could argue
    this is a more serious breakage than simple dangling. But the
    overall effect is the same (we could not follow the symref), as well
    as the impact on things like REF_PARANOIA (either way, a symref we
    can't follow won't impact reachability, because we'll see the ref
    itself during iteration). The underlying resolution function doesn't
    distinguish these two cases (they both get REF_ISBROKEN).

  - we change the iterator in refs/files-backend.c where we check
    INCLUDE_BROKEN. There's a matching spot in refs/packed-backend.c,
    but we don't know need to do anything there. The packed backend does
    not support symrefs at all.

The resulting set of flags might be a bit easier to follow if we broke
this down into "INCLUDE_CORRUPT_REFS" and "INCLUDE_DANGLING_SYMREFS".
But there are a few reasons not do so:

  - adding a new OMIT_DANGLING_SYMREFS flag lets us leave existing
    callers intact, without changing their behavior (and some of them
    really do want to see the dangling symrefs; e.g., t5505 has a test
    which expects us to report when a symref becomes dangling)

  - they're not actually independent. You cannot say "include dangling
    symrefs" without also including refs whose objects are not
    reachable, because dangling symrefs by definition do not have an
    object. We could tweak the implementation to distinguish this, but
    in practice nobody wants to ask for that. Adding the OMIT flag keeps
    the implementation simple and makes sure we don't regress the
    current behavior.

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-27 12:36:45 -07:00
9aab952e85 refs-internal.h: reorganize DO_FOR_EACH_* flag documentation
The documentation for the DO_FOR_EACH_* flags is sprinkled over the
refs-internal.h file. We define the two flags in one spot, and then
describe them in more detail far away from there, in the definitions of
refs_ref_iterator_begin() and ref_iterator_advance_fn().

Let's try to organize this a bit better:

  - convert the #defines to an enum. This makes it clear that they are
    related, and that the enum shows the complete set of flags.

  - combine all descriptions for each flag in a single spot, next to the
    flag's definition

  - use the enum rather than a bare int for functions which take the
    flags. This helps readers realize which flags can be used.

  - clarify the mention of flags for ref_iterator_advance_fn(). It does
    not take flags itself, but is meant to depend on ones set up
    earlier.

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-27 12:36:45 -07:00
bf708add2e refs-internal.h: move DO_FOR_EACH_* flags next to each other
There are currently two DO_FOR_EACH_* flags, which must not have their
bits overlap. Yet they're defined hundreds of lines apart. Let's move
them next to each other to make it clear that they are related and are a
complete set (which matters if you are adding a new flag and would like
to know what the next available bit is).

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-27 12:36:45 -07:00
5b062e1f79 t5312: be more assertive about command failure
When repacking or pruning in a corrupted repository, our tests in t5312
argue that it is OK to complete the operation or bail, as long as we
don't actually delete the objects pointed to by the corruption.

This isn't a wrong line of reasoning, but the tests are a bit permissive
by using test_might_fail. The fact is that we _do_ bail currently, and
if we ever stopped doing so, that would be worthy of a human
investigating. So let's switch these to test_must_fail.

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-27 12:36:45 -07:00
078eecbcbe t5312: test non-destructive repack
In t5312, we create a state with a broken ref, and then make sure that
destructive repacks don't silently ignore the breakage (where a
destructive repack is one that might drop objects). But we don't check
the behavior of non-destructive repacks at all (i.e., ones where we'd
keep unreachable objects).

So let's add a test to confirm the current behavior, which is that
they are allowed (i.e., ignoring the breakage and considering any
objects it points to as unreachable). This may change in the future, but
we'd like for the test suite to alert us to that fact.

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-27 12:36:45 -07:00
f805844676 t5312: create bogus ref as necessary
Some tests in t5312 create an illegally-named ref, and then see how
various operations handle it. But between those operations, we also do
some more setup (e.g., repacking), and we are subtly depending on how
those setup steps react to the illegal ref.

To future-proof us against those behaviors changing, let's instead
create and clean up our bogus ref on demand in the tests that need it.

This has two small extra advantages:

 - the tests are more stand-alone; we do not need an extra test to clean
   up the ref before moving on to other parts of the script

 - the creation and cleanup is together in one helper function. Because
   these depend on touching the refs in the filesystem directly, they
   may need to be tweaked for a world with alternate backends (they have
   not been noticed so far in the reftable work because with a non-file
   backend the tests don't fail; they simply become uninteresting noops
   because the broken ref isn't read at all).

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-27 12:36:44 -07:00
2ac0cbc9b0 t5312: drop "verbose" helper
t5312 has several uses of the "verbose" helper, as described in
8ad1652418 (t5304: use helper to report failure of "test foo = bar",
2014-10-10). Back then the "-x" trace option for tests was new, and was
not as pleasant to use (e.g., some tests failed under "-x", we did not
support BASH_XTRACEFD, etc).

These days it is clear that "-x" is the preferred way to get extra
output, and we don't need to mark up individual tests. Let's get rid of
the uses of "verbose" here, as one step toward eradicating it totally.

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-27 12:36:44 -07:00
da5e0c6a00 t5600: provide detached HEAD for corruption failures
When checking how git-clone behaves when it fails, we stimulate some
failures by trying to do a clone from a local repository whose objects
have been removed. Because these clones use local optimizations, there's
a subtle dependency in how the corruption is handled on the sending
side.

If upload-pack does not show us the broken refs (which it does not
currently), then we see only HEAD (which is itself broken), and clone
that as a detached HEAD. When we try to write the ref, we notice that we
never got the object and bail.

But if upload-pack _does_ show us the broken refs (which it may in a
future patch), then we'll realize that HEAD is a symref and just write
that. You'd think we'd fail when writing out the refs themselves, but we
don't; we do a bulk write and skip the connectivity check because of our
--local optimizations. For the non-bare case, we do notice the problem
when we try to checkout. But for a bare repository, we unexpectedly
complete the clone successfully!

At first glance this may seem like a bug. But the whole point of those
local optimizations is to give up some safety for speed. If you want to
be careful, you should be using "--no-local", which would notice that
the pack did not transfer sufficient objects. We could do that in these
tests, but part of the point is for them to fail at specific moments
(and indeed, we have a later test that checks for transport failure).

However, we can make this less subtle and future-proof it against
changes on the upload-pack side by just having an explicit detached
HEAD in the corrupted repo. Now we'll fail as expected during the ref
write if any ref _or_ HEAD is corrupt, whether we're --bare or not.

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-27 12:36:44 -07:00
e9de7a52a5 t5516: don't use HEAD ref for invalid ref-deletion tests
A few tests in t5516 want to assert that we can delete a corrupted ref
whose pointed-to object is missing. They do so by using the "main"
branch, which is also pointed to by HEAD.

This does work, but only because of a subtle assumption about the
implementation. We do not block the deletion because of the invalid ref,
but we _also_ do not notice that the deleted branch is pointed to by
HEAD. And so the safety rule of "do not allow HEAD to be deleted in a
non-bare repository" does not kick in, and the test passes.

Let's instead use a non-HEAD branch. That still tests what we care about
here (deleting a corrupt ref), but without implicitly depending on our
failure to notice that we're deleting HEAD. That will future proof the
test against that behavior changing.

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-27 12:36:44 -07:00
b4724242fa t7900: clean up some more broken refs
The "incremental-repack task" test replaces the object directory with a
known state. As a result, some of our refs point to objects that are not
included in that state.

Commit 3cf5f221be (t7900: clean up some broken refs, 2021-01-19) cleaned
up some of those (that were causing warnings to stderr from the
maintenance process). But there are a few more that were missed. These
aren't hurting anything for now, but it's certainly an unexpected state
to leave the test repository in, and it will become a problem if repack
ever gets more picky about broken refs.

Let's clean up those additional refs (which are all in refs/remotes,
with nothing there that isn't broken), and add an extra "for-each-ref"
call to assert that we've got everything.

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-27 12:36:44 -07:00
3e8084f188 http: check CURLE_SSL_PINNEDPUBKEYNOTMATCH when emitting errors
Change the error shown when a http.pinnedPubKey doesn't match to point
the http.pinnedPubKey variable added in aeff8a6121 (http: implement
public key pinning, 2016-02-15), e.g.:

    git -c http.pinnedPubKey=sha256/someNonMatchingKey ls-remote https://github.com/git/git.git
    fatal: unable to access 'https://github.com/git/git.git/' with http.pinnedPubkey configuration: SSL: public key does not match pinned public key!

Before this we'd emit the exact same thing without the " with
http.pinnedPubkey configuration". The advantage of doing this is that
we're going to get a translated message (everything after the ":" is
hardcoded in English in libcurl), and we've got a reference to the
git-specific configuration variable that's causing the error.

Unfortunately we can't test this easily, as there are no tests that
require https:// in the test suite, and t/lib-httpd.sh doesn't know
how to set up such tests. See [1] for the start of a discussion about
what it would take to have divergent "t/lib-httpd/apache.conf" test
setups. #leftoverbits

1. https://lore.kernel.org/git/YUonS1uoZlZEt+Yd@coredump.intra.peff.net/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-27 10:58:07 -07:00
44d2aec6e8 connect: also update offset for features without values
parse_feature_value() takes an offset, and uses it to seek past the
point in features_list that we've already seen. However if the feature
being searched for does not specify a value, the offset is not
updated. Therefore if we call parse_feature_value() in a loop on a
value-less feature, we'll keep on parsing the same feature over and over
again. This usually isn't an issue: there's no point in using
next_server_feature_value() to search for repeated instances of the same
capability unless that capability typically specifies a value - but a
broken server could send a response that omits the value for a feature
even when we are expecting a value.

Therefore we add an offset update calculation for the no-value case,
which helps ensure that loops using next_server_feature_value() will
always terminate.

next_server_feature_value(), and the offset calculation, were first
added in 2.28 in 2c6a403d96 (connect: add function to parse multiple
v1 capability values, 2020-05-25).

Thanks to Peff for authoring the test.

Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Andrzej Hunt <andrzej@ahunt.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-27 10:34:41 -07:00
cfe853e66b hook-list.h: add a generated list of hooks, like config-list.h
Make githooks(5) the source of truth for what hooks git supports, and
punt out early on hooks we don't know about in find_hook(). This
ensures that the documentation and the C code's idea about existing
hooks doesn't diverge.

We still have Perl and Python code running its own hooks, but that'll
be addressed by Emily Shaffer's upcoming "git hook run" command.

This resolves a long-standing TODO item in bugreport.c of there being
no centralized listing of hooks, and fixes a bug with the bugreport
listing only knowing about 1/4 of the p4 hooks. It didn't know about
the recent "reference-transaction" hook either.

We could make the find_hook() function die() or BUG() out if the new
known_hook() returned 0, but let's make it return NULL just as it does
when it can't find a hook of a known type. Making it die() is overly
anal, and unlikely to be what we need in catching stupid typos in the
name of some new hook hardcoded in git.git's sources. By making this
be tolerant of unknown hook names, changes in a later series to make
"git hook run" run arbitrary user-configured hook names will be easier
to implement.

I have not been able to directly test the CMake change being made
here. Since 4c2c38e800 (ci: modification of main.yml to use cmake for
vs-build job, 2020-06-26) some of the Windows CI has a hard dependency
on CMake, this change works there, and is to my eyes an obviously
correct use of a pattern established in previous CMake changes,
namely:

 - 061c2240b1 (Introduce CMake support for configuring Git,
    2020-06-12)
 - 709df95b78 (help: move list_config_help to builtin/help,
    2020-04-16)
 - 976aaedca0 (msvc: add a Makefile target to pre-generate the Visual
   Studio solution, 2019-07-29)

The LC_ALL=C is needed because at least in my locale the dash ("-") is
ignored for the purposes of sorting, which results in a different
order. I'm not aware of anything in git that has a hard dependency on
the order, but e.g. the bugreport output would end up using whatever
locale was in effect when git was compiled.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Helped-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-27 09:44:54 -07:00
07a348e746 hook.c users: use "hook_exists()" instead of "find_hook()"
Use the new hook_exists() function instead of find_hook() where the
latter was called in boolean contexts. This make subsequent changes in
a series where we further refactor the hook API clearer, as we won't
conflate wanting to get the path of the hook with checking for its
existence.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-27 09:44:54 -07:00
330155ed8a hook.c: add a hook_exists() wrapper and use it in bugreport.c
Add a boolean version of the find_hook() function for those callers
who are only interested in checking whether the hook exists, not what
the path to it is.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-27 09:44:54 -07:00
5e3aba33da hook.[ch]: move find_hook() from run-command.c to hook.c
Move the find_hook() function from run-command.c to a new hook.c
library. This change establishes a stub library that's pretty
pointless right now, but will see much wider use with Emily Shaffer's
upcoming "configuration-based hooks" series.

Eventually all the hook related code will live in hook.[ch]. Let's
start that process by moving the simple find_hook() function over
as-is.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-27 09:44:54 -07:00
d2c470f9bc lazyload.h: fix warnings about mismatching function pointer types
Here, GCC warns about every use of the INIT_PROC_ADDR macro, for example:

In file included from compat/mingw.c:8:
compat/mingw.c: In function 'mingw_strftime':
compat/win32/lazyload.h:38:12: warning: assignment to
   'size_t (*)(char *, size_t,  const char *, const struct tm *)'
   {aka 'long long unsigned int (*)(char *, long long unsigned int,
      const char *, const struct tm *)'} from incompatible pointer type
   'FARPROC' {aka 'long long int (*)()'} [-Wincompatible-pointer-types]
   38 |  (function = get_proc_addr(&proc_addr_##function))
      |            ^
compat/mingw.c:1014:6: note: in expansion of macro 'INIT_PROC_ADDR'
 1014 |  if (INIT_PROC_ADDR(strftime))
      |      ^~~~~~~~~~~~~~

(message wrapped for convenience). Insert a cast to keep the compiler
happy. A cast is fine in these cases because they are generic function
pointer values that have been looked up in a DLL.

Helped-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-27 09:31:59 -07:00
ca267aee15 t3705: test that 'sparse_entry' is unstaged
The tests in t3705-add-sparse-checkout.sh check to see how 'git add'
behaves with paths outside the sparse-checkout definition. These
currently check to see if a given warning is present but not that the
index is not updated with the sparse entries. Add a new
'test_sparse_entry_unstaged' helper to be sure 'git add' is behaving
correctly.

We need to modify setup_sparse_entry to actually commit the sparse_entry
file so it exists at HEAD and as an entry in the index, but its exact
contents are not staged in the index.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-24 11:43:56 -07:00
446cc5544a t2500: add various tests for nuking untracked files
Noting that unpack_trees treats reset=1 & update=1 as license to nuke
untracked files, I looked for code paths that use this combination and
tried to generate testcases which demonstrated unintentional loss of
untracked files and directories.  I found several.

I also include testcases for `git reset --{hard,merge,keep}`.  A hard
reset is perhaps the most direct test of unpack_tree's reset=1 behavior,
but we cannot make `git reset --hard` preserve untracked files without
some migration work.

Also, the two commands `checkout --force` (because of the --force) and
`read-tree --reset` (because it's plumbing and we need to keep it
backward compatible) were left out as we expect those to continue
removing untracked files and directories.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-24 09:24:25 -07:00
8c6b4332b4 packfile: release bad_objects in close_pack()
Unusable entries of a damaged pack file are recorded in the oidset
bad_objects.  Release it when we're done with the pack.

This doesn't affect intact packs because an empty oidset requires
no allocation.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-24 09:22:46 -07:00
2b88fe0603 rebase: fix todo-list rereading
54fd3243da ("rebase -i: reread the todo list if `exec` touched it",
2017-04-26) sought to reread the todo list after running an exec
command only if it had been changed. To accomplish this it checks the
stat data of the todo list after running an exec command to see if it
has changed. Unfortunately there are two problems, firstly the
implementation is buggy we actually reread the list after each exec
which is quadratic in the number of commit lookups and secondly the
design is predicated on using nanosecond time stamps which are not the
default.

The implementation bug stems from the fact that we write a new todo
list to disk before running each command but do not update the stat
data to reflect this[1].

The design problem is that it is possible for the user to edit the
todo list without changing its size or inode which means we have to
rely on the mtime to tell us if it has changed. Unfortunately unless
git is built with USE_NSEC it is possible for the original and edited
list to share the same mtime.

Ideally "git rebase --edit-todo" would set a flag that we would then
check in sequencer.c. Unfortunately this is approach will not work as
there are scripts in the wild that write to the todo list directly
without running "git rebase --edit-todo". Instead of relying on stat
data this patch simply reads the possibly edited todo list and
compares it to the original with memcmp(). This is much faster than
reparsing the todo list each time. This patch reduces the time to run

   git rebase -r -xtrue v2.32.0~100 v2.32.0

which runs 419 exec commands by 6.6%. For comparison fixing the
implementation bug in stat based approach reduces the time by a
further 1.4% and is indistinguishable from never rereading the todo
list.

[1] https://lore.kernel.org/git/20191125131833.GD23183@szeder.dev/

Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-24 08:56:28 -07:00
dfa8bae5a2 sequencer.c: factor out a function
This code is heavily indented and obscures the high level logic within
the loop. Let's move it to its own function before modifying it in the
next commit. Note that there is a subtle change in behavior if the
todo list cannot be reread. Previously todo_list->current was
incremented before returning, now it returns immediately.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-24 08:56:27 -07:00
0d0d8d8a11 doc/technical: update note about core.multiPackIndex
MIDX files are used by default since commit 18e449f86b
(midx: enable core.multiPackIndex by default, 2020-09-25)

Helped-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-24 08:39:53 -07:00
f53df0bdf6 Makefile: remove an out-of-date comment
This comment added in dfea575017 (Makefile: lazily compute header
dependencies, 2010-01-26) has been out of date since
92b88eba9f (Makefile: use `git ls-files` to list header files, if
possible, 2019-03-04), when we did exactly what it tells us not to do
and added $(GENERATED_H) to $(OBJECTS) dependencies.

The rest of it was also somewhere between inaccurate and outdated,
since as of b8ba629264 (Makefile: fold MISC_H into LIB_H, 2012-06-20)
it's not followed by a list of header files, that got moved earlier in
the file into LIB_H in 60d24dd255 (Makefile: fold XDIFF_H and VCSSVN_H
into LIB_H, 2012-07-06).

Let's just remove it entirely, to the extent that we have anything
useful to say here the comment on the
"USE_COMPUTED_HEADER_DEPENDENCIES" variable a few lines above this
change does the job for us.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-23 15:06:47 -07:00
7c81295382 Makefile: don't perform "mv $@+ $@" dance for $(GENERATED_H)
Change the "cmd.sh > $@+ && mv $@+ $@" pattern used for generating the
config-list.h and command-list.h to just "cmd.sh >$@". This was needed
as a guard to ensure that we don't have an empty file if the script
failed, but since 7b76d6bf22 (Makefile: add and use the
".DELETE_ON_ERROR" flag, 2021-06-29) GNU make ensures that doesn't
happen.

There's still a lot of other places in the Makefile where we
needlessly use this pattern, but I'm just changing these because I'm
about to add a new $(GENERATED_H) target, let's have them all look and
act the same way.

Even with ".DELETE_ON_ERROR" there is still a point to using the "mv
$@+ $@" pattern in some cases, e.g. to ensure that you have a working
binary during recompilation (see [1] for the start of a long
discussion about that), but that doesn't apply here. Nothing external
uses $(GENERATED_H) directly, it's only ever used in the context of
the Makefile's own dependency (re-)generation.

1. https://lore.kernel.org/git/8735t93h0u.fsf@evledraar.gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-23 15:06:47 -07:00
7c3c0a99cc Makefile: stop hardcoding {command,config}-list.h
Change various places that hardcode the names of these two files to
refer to either $(GENERATED_H), or to a new generated-hdrs
target. That target is consistent with the *-objs targets I recently
added in 029bac01a8 (Makefile: add {program,xdiff,test,git,fuzz}-objs
& objects targets, 2021-02-23).

A subsequent commit will add a new generated hook-list.h. By doing
this refactoring we'll only need to add the new file to the
GENERATED_H variable, not EXCEPT_HDRS, the vcbuild/README etc.

Hardcoding command-list.h there seems to have been a case of
copy/paste programming in 976aaedca0 (msvc: add a Makefile target to
pre-generate the Visual Studio solution, 2019-07-29). The
config-list.h was added later in 709df95b78 (help: move
list_config_help to builtin/help, 2020-04-16).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-23 15:06:47 -07:00
ea47e59fe3 Makefile: mark "check" target as .PHONY
Fix a bug in 44c9e8594e (Fix up header file dependencies and add
sparse checking rules, 2005-07-03), we never marked the phony "check"
target as such.

Perhaps we should just remove it, since as of a combination of
912f9980d2 (Makefile: help people who run 'make check' by mistake,
2008-11-11) 0bcd9ae85d (sparse: Fix errors due to missing
target-specific variables, 2011-04-21) we've been suggesting the user
run "make sparse" directly.

But under that mode it still does something, as well as directing the
user to run "make test" under non-sparse. So let's punt that and
narrowly fix the PHONY bug.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-23 15:06:47 -07:00
f188160be9 bundle: remove ignored & undocumented "--verbose" flag
In 73c3253d75 (bundle: framework for options before bundle file,
2019-11-10) the "git bundle" command was refactored to use
parse_options(). In that refactoring it started understanding the
"--verbose" flag before the subcommand, e.g.:

    git bundle --verbose verify --quiet

However, nothing ever did anything with this "verbose" variable, and
the change wasn't documented. It appears to have been something that
escaped the lab, and wasn't flagged by reviewers at the time. Let's
just remove it.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-23 15:03:48 -07:00
ddb1055343 The eighth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-23 13:45:03 -07:00
b1b065ee35 Merge branch 'rs/use-xopen-in-index-pack'
Code clean-up.

* rs/use-xopen-in-index-pack:
  index-pack: use xopen in init_thread
2021-09-23 13:44:50 -07:00
d1e376d2f9 Merge branch 'kz/revindex-comment-fix'
Header comment fix.

* kz/revindex-comment-fix:
  pack-revindex.h: correct the time complexity descriptions
2021-09-23 13:44:49 -07:00
50eb005eb3 Merge branch 'cb/plug-leaks-in-alloca-emu-users'
Leakfix.

* cb/plug-leaks-in-alloca-emu-users:
  t0000: avoid masking git exit value through pipes
  tree-diff: fix leak when not HAVE_ALLOCA_H
2021-09-23 13:44:49 -07:00
f7511fdfbd Merge branch 'jt/submodule-name-to-gitdir'
Code refactoring.

* jt/submodule-name-to-gitdir:
  submodule: extract path to submodule gitdir func
2021-09-23 13:44:49 -07:00
188da7dc09 Merge branch 'ma/doc-git-version'
Doc update.

* ma/doc-git-version:
  documentation: add documentation for 'git version'
2021-09-23 13:44:49 -07:00
bd42622e5f Merge branch 'ma/help-w-check-for-requested-page'
The error in "git help no-such-git-command" is handled better.

* ma/help-w-check-for-requested-page:
  help: make sure local html page exists before calling external processes
2021-09-23 13:44:48 -07:00
c2e799012b Merge branch 'cb/unix-sockets-with-windows'
Adjust credential-cache helper to Windows.

* cb/unix-sockets-with-windows:
  git-compat-util: include declaration for unix sockets in windows
  credential-cache: check for windows specific errors
  t0301: fixes for windows compatibility
2021-09-23 13:44:48 -07:00
0e35107e7d Merge branch 'ab/retire-option-argument'
An oddball OPTION_ARGUMENT feature has been removed from the
parse-options API.

* ab/retire-option-argument:
  parse-options API: remove OPTION_ARGUMENT feature
  difftool: use run_command() API in run_file_diff()
  difftool: prepare "diff" cmdline in cmd_difftool()
  difftool: prepare "struct child_process" in cmd_difftool()
2021-09-23 13:44:48 -07:00
0a4cb1f1f2 Merge branch 'mr/bisect-in-c-4'
Rewrite of "git bisect" in C continues.

* mr/bisect-in-c-4:
  bisect--helper: retire `--bisect-next-check` subcommand
  bisect--helper: reimplement `bisect_run` shell function in C
  bisect--helper: reimplement `bisect_visualize()` shell function in C
  run-command: make `exists_in_PATH()` non-static
  t6030-bisect-porcelain: add test for bisect visualize
  t6030-bisect-porcelain: add tests to control bisect run exit cases
2021-09-23 13:44:48 -07:00
57e4a7b633 Merge branch 'ab/unused-script-helpers'
Code clean-up.

* ab/unused-script-helpers:
  test-lib: remove unused $_x40 and $_z40 variables
  git-bisect: remove unused SHA-1 $x40 shell variable
  git-sh-setup: remove unused "pull with rebase" message
  git-submodule: remove unused is_zero_oid() function
2021-09-23 13:44:47 -07:00
8f79fb6445 Merge branch 'ab/http-drop-old-curl-plus'
Conditional compilation around versions of libcURL has been
straightened out.

* ab/http-drop-old-curl-plus:
  http: don't hardcode the value of CURL_SOCKOPT_OK
  http: centralize the accounting of libcurl dependencies
  http: correct curl version check for CURLOPT_PINNEDPUBLICKEY
  http: correct version check for CURL_HTTP_VERSION_2
  http: drop support for curl < 7.18.0 (again)
  Makefile: drop support for curl < 7.9.8 (again)
  INSTALL: mention that we need libcurl 7.19.4 or newer to build
  INSTALL: reword and copy-edit the "libcurl" section
  INSTALL: don't mention the "curl" executable at all
2021-09-23 13:44:47 -07:00
68658a867d Merge branch 'po/git-config-doc-mentions-help-c'
Doc update.

* po/git-config-doc-mentions-help-c:
  doc: config, tell readers of `git help --config`
2021-09-23 13:44:47 -07:00
cabb41d0f6 Merge branch 'jk/http-server-protocol-versions'
Taking advantage of the CGI interface, http-backend has been
updated to enable protocol v2 automatically when the other side
asks for it.

* jk/http-server-protocol-versions:
  docs/protocol-v2: point readers transport config discussion
  docs/git: discuss server-side config for GIT_PROTOCOL
  docs/http-backend: mention v2 protocol
  http-backend: handle HTTP_GIT_PROTOCOL CGI variable
  t5551: test v2-to-v0 http protocol fallback
2021-09-23 13:44:47 -07:00
b5866edf97 Merge branch 'ab/gc-remove-unused-call'
Code clean-up.

* ab/gc-remove-unused-call:
  gc: remove unused launchctl_get_uid() call
2021-09-23 13:44:46 -07:00
ffb0387608 Merge branch 'ab/test-tool-run-command-cleanup'
Code clean-up.

* ab/test-tool-run-command-cleanup:
  test-tool run-command: fix flip-flop init pattern
2021-09-23 13:44:46 -07:00
b83e131029 Merge branch 'en/tests-cleanup-leftover-untracked'
Test clean-up.

* en/tests-cleanup-leftover-untracked:
  tests: remove leftover untracked files
2021-09-23 13:44:46 -07:00
91b2c79394 Merge branch 'jk/strvec-typefix'
Correct nr and alloc members of strvec struct to be of type size_t.

* jk/strvec-typefix:
  strvec: use size_t to store nr and alloc
2021-09-23 13:44:46 -07:00
e3b77a2d03 Merge branch 'rs/drop-core-compression-vars'
Code clean-up.

* rs/drop-core-compression-vars:
  compression: drop write-only core_compression_* variables
2021-09-23 13:44:46 -07:00
28caad63d0 Merge branch 'rs/packfile-bad-object-list-in-oidset'
Replace a handcrafted data structure used to keep track of bad
objects in the packfile API by an oidset.

* rs/packfile-bad-object-list-in-oidset:
  packfile: use oidset for bad objects
  packfile: convert has_packed_and_bad() to object_id
  packfile: convert mark_bad_packed_object() to object_id
  midx: inline nth_midxed_pack_entry()
  oidset: make oidset_size() an inline function
2021-09-23 13:44:46 -07:00
6c84b007c4 Merge branch 'en/am-abort-fix'
When "git am --abort" fails to abort correctly, it still exited
with exit status of 0, which has been corrected.

* en/am-abort-fix:
  am: fix incorrect exit status on am fail to abort
  t4151: add a few am --abort tests
  git-am.txt: clarify --abort behavior
2021-09-23 13:44:45 -07:00
06a0eeaa25 Merge branch 'ps/update-ref-batch-flush'
"git update-ref --stdin" failed to flush its output as needed,
which potentially led the conversation to a deadlock.

* ps/update-ref-batch-flush:
  t1400: avoid SIGPIPE race condition on fifo
  update-ref: fix streaming of status updates
2021-09-23 13:44:45 -07:00
956d2e4639 tests: add a test mode for SANITIZE=leak, run it in CI
While git can be compiled with SANITIZE=leak, we have not run
regression tests under that mode. Memory leaks have only been fixed as
one-offs without structured regression testing.

This change adds CI testing for it. We'll now build and small set of
whitelisted t00*.sh tests under Linux with a new job called
"linux-leaks".

The CI target uses a new GIT_TEST_PASSING_SANITIZE_LEAK=true test
mode. When running in that mode, we'll assert that we were compiled
with SANITIZE=leak. We'll then skip all tests, except those that we've
opted-in by setting "TEST_PASSES_SANITIZE_LEAK=true".

A test setting "TEST_PASSES_SANITIZE_LEAK=true" setting can in turn
make use of the "SANITIZE_LEAK" prerequisite, should they wish to
selectively skip tests even under
"GIT_TEST_PASSING_SANITIZE_LEAK=true". In the preceding commit we
started doing this in "t0004-unwritable.sh" under SANITIZE=leak, now
it'll combine nicely with "GIT_TEST_PASSING_SANITIZE_LEAK=true".

This is how tests that don't set "TEST_PASSES_SANITIZE_LEAK=true" will
be skipped under GIT_TEST_PASSING_SANITIZE_LEAK=true:

    $ GIT_TEST_PASSING_SANITIZE_LEAK=true ./t0001-init.sh
    1..0 # SKIP skip all tests in t0001 under SANITIZE=leak, TEST_PASSES_SANITIZE_LEAK not set

The intent is to add more TEST_PASSES_SANITIZE_LEAK=true annotations
as follow-up change, but let's start small to begin with.

In ci/run-build-and-tests.sh we make use of the default "*" case to
run "make test" without any GIT_TEST_* modes. SANITIZE=leak is known
to fail in combination with GIT_TEST_SPLIT_INDEX=true in
t0016-oidmap.sh, and we're likely to have other such failures in
various GIT_TEST_* modes. Let's focus on getting the base tests
passing, we can expand coverage to GIT_TEST_* modes later.

It would also be possible to implement a more lightweight version of
this by only relying on setting "LSAN_OPTIONS". See
<YS9OT/pn5rRK9cGB@coredump.intra.peff.net>[1] and
<YS9ZIDpANfsh7N+S@coredump.intra.peff.net>[2] for a discussion of
that. I've opted for this approach of adding a GIT_TEST_* mode instead
because it's consistent with how we handle other special test modes.

Being able to add a "!SANITIZE_LEAK" prerequisite and calling
"test_done" early if it isn't satisfied also means that we can more
incrementally add regression tests without being forced to fix
widespread and hard-to-fix leaks at the same time.

We have tests that do simple checking of some tool we're interested
in, but later on in the script might be stressing trace2, or common
sources of leaks like "git log" in combination with the tool (e.g. the
commit-graph tests). To be clear having a prerequisite could also be
accomplished by using "LSAN_OPTIONS" directly.

On the topic of "LSAN_OPTIONS": It would be nice to have a mode to
aggregate all failures in our various scripts, see [2] for a start at
doing that which sets "log_path" in "LSAN_OPTIONS". I've punted on
that for now, it can be added later.

As of writing this we've got major regressions between master..seen,
i.e. the t000*.sh tests and more fixed since 31f9acf9ce (Merge branch
'ah/plugleaks', 2021-08-04) have regressed recently.

See the discussion at <87czsv2idy.fsf@evledraar.gmail.com>[3] about
the lack of this sort of test mode, and 0e5bba53af (add UNLEAK
annotation for reducing leak false positives, 2017-09-08) for the
initial addition of SANITIZE=leak.

See also 09595ab381 (Merge branch 'jk/leak-checkers', 2017-09-19),
7782066f67 (Merge branch 'jk/apache-lsan', 2019-05-19) and the recent
936e58851a (Merge branch 'ah/plugleaks', 2021-05-07) for some of the
past history of "one-off" SANITIZE=leak (and more) fixes.

As noted in [5] we can't support this on OSX yet until Clang 14 is
released, at that point we'll probably want to resurrect that
"osx-leaks" job.

1. https://github.com/google/sanitizers/wiki/AddressSanitizerLeakSanitizer
2. https://lore.kernel.org/git/YS9OT%2Fpn5rRK9cGB@coredump.intra.peff.net/
3. https://lore.kernel.org/git/87czsv2idy.fsf@evledraar.gmail.com/
4. https://lore.kernel.org/git/YS9ZIDpANfsh7N+S@coredump.intra.peff.net/
5. https://lore.kernel.org/git/20210916035603.76369-1-carenas@gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-23 11:29:45 -07:00
2cdc292b31 Makefile: add SANITIZE=leak flag to GIT-BUILD-OPTIONS
When SANITIZE=leak is specified we'll now add a SANITIZE_LEAK flag to
GIT-BUILD-OPTIONS, this can then be picked up by the test-lib.sh,
which sets a SANITIZE_LEAK prerequisite.

We can then skip specific tests that are known to fail under
SANITIZE=leak, add one such annotation to t0004-unwritable.sh, which
now passes under SANITIZE=leak.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-23 11:29:45 -07:00
77bd616367 Merge branch 'da/difftool-dir-diff-symlink-fix' into da/difftool
* da/difftool-dir-diff-symlink-fix:
  difftool: fix symlink-file writing in dir-diff mode
2021-09-23 11:26:17 -07:00
5bafb3576a difftool: fix symlink-file writing in dir-diff mode
The difftool dir-diff mode handles symlinks by replacing them with their
readlink(2) values. This allows diff tools to see changes to symlinks
as if they were regular text diffs with the old and new path values.
This is analogous to what "git diff" displays when symlinks change.

The temporary diff directories that are created initially contain
symlinks because they get checked-out using a temporary index that
retains the original symlinks as checked-in to the repository.

A bug was introduced when difftool was rewritten in C that made
difftool write the readlink(2) contents into the pointed-to file rather
than the symlink itself. The write was going through the symlink and
writing to its target rather than writing to the symlink path itself.

Replace symlinks with raw text files by unlinking the symlink path
before writing the readlink(2) content into them.

When 18ec800512 (difftool: handle modified symlinks in dir-diff mode,
2017-03-15) added handling for modified symlinks this bug got recorded
in the test suite. The tests included the pointed-to symlink target
paths. These paths were being reported because difftool was erroneously
writing to them, but they should have never been reported nor written.

Correct the modified-symlinks test cases by removing the target files
from the expected output.

Add a test to ensure that symlinks are written with the readlink(2)
values and that the target files contain their original content.

Reported-by: Alan Blotz <work@blotz.org>
Helped-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-23 11:24:41 -07:00
06fa4db3f7 help: move column config discovery to help.c library
When a git_config() call was added in dbfae68969 (help: reuse
print_columns() for help -a, 2012-04-13) to read the column config
we'd always use the resulting "colopts" variable.

Then in 63eae83f8f (help: add "-a --verbose" to list all commands
with synopsis, 2018-05-20) we started only using the "colopts" config
under "--all" if "--no-verbose" was also given, but the "git_config()"
call was not moved inside the "verbose" branch of the code.

This change effectively does that, we'll only call list_commands()
under "--all --no-verbose", so let's have it look up the config it
needs. See 26c7d06783 (help -a: improve and make --verbose default, 2018-09-29) for another case in help.c where we look up config.

The get_colopts() function is named for consistency with the existing
get_alias() function added in 26c7d06783.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-23 10:30:43 -07:00
a9bacccae5 help / completion: make "git help" do the hard work
The "help" builtin has been able to emit configuration variables since
e17ca92637 (completion: drop the hard coded list of config vars,
2018-05-26), but it hasn't produced exactly the format the completion
script wanted. Let's do that.

We got partway there in 2675ea1cc0 (completion: use 'sort -u' to
deduplicate config variable names, 2019-08-13) and
d9438873c4 (completion: deduplicate configuration sections,
2019-08-13), but after both we still needed some sorting,
de-duplicating and awk post-processing of the list.

We can instead simply do the relevant parsing ourselves (we were doing
most of it already), and call string_list_remove_duplicates() after
already sorting the list, so the caller doesn't need to invoke "sort
-u". The "--config-for-completion" output is the same as before after
being passed through "sort -u".

Then add a new "--config-sections-for-completion" option. Under that
output we'll emit config sections like "alias" (instead of "alias." in
the --config-for-completion output).

We need to be careful to leave the "--config-for-completion" option
compatible with users git, but are still running a shell with an older
git-completion.bash. If we e.g. changed the option name they'd see
messages about git-completion.bash being unable to find the
"--config-for-completion" option.

Such backwards compatibility isn't something we should bend over
backwards for, it's only helping users who:

 * Upgrade git
 * Are in an old shell
 * The git-completion.bash in that shell hasn't cached the old
   "--config-for-completion" output already.

But since it's easy in this case to retain compatibility, let's do it,
the older versions of git-completion.bash won't care that the input
they get doesn't change after a "sort -u".

While we're at it let's make "--config-for-completion" die if there's
anything left over in "argc", and do the same in the new
"--config-sections-for-completion" option.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-23 10:30:43 -07:00
5a5f04d86b help tests: test --config-for-completion option & output
Add a regression test for the --config-for-completion option, this was
tested for indirectly with the test added in 7a09a8f093 (completion:
add tests for 'git config' completion, 2019-08-13), but let's do it
directly here as well.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-23 10:30:43 -07:00
d35d03cf93 help: simplify by moving to OPT_CMDMODE()
As preceding commits have incrementally established all of the --all,
--guides, --config and hidden --config-for-completion options are
mutually exclusive. So let's use OPT_CMDMODE() to parse the
command-line instead, and take advantage of its conflicting options
detection.

This is the first command with a hidden CMDMODE, so let's introduce a
OPT_CMDMODE_F() macro to go along with OPT_CMDMODE().

I think this makes the usage information that we emit slightly worse,
e.g. before we'd emit:

    $ git help --all --config
    fatal: --config and --all cannot be combined

    usage: git help [-a|--all] [--[no-]verbose]]
             [[-i|--info] [-m|--man] [-w|--web]] [<command>]
       or: git help [-g|--guides]
       or: git help [-c|--config]
    [...]
    $

And now:

    $ git help --all --config
    error: option `config' is incompatible with --all
    $

But improving that is a general topic for parse-options.c improvement,
i.e. we should probably emit the full usage in that case.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-23 10:30:43 -07:00
0a5940fbe7 help: correct logic error in combining --all and --guides
The --all and --guides commands could be combined, which wouldn't have
any impact on the output except for:

    git help --all --guides --no-verbose

Listing the guide alongside that output was clearly not intended, so
let's error out here. See 002b726a40 (builtin/help.c: add
list_common_guides_help() function, 2013-04-02) for the initial
implementation.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-23 10:30:43 -07:00
1ed4bef6b4 help: correct logic error in combining --all and --config
Fix a bug in the --config option that's been there ever since its
introduction in 3ac68a93fd (help: add --config to list all available
config, 2018-05-26). Die when --all and --config are combined,
combining them doesn't make sense.

The code for the --config option when combined with an earlier
refactoring done to support the --guide option in
65f98358c0 (builtin/help.c: add --guide option, 2013-04-02) would
cause us to take the "--all" branch early and ignore the --config
option.

Let's instead list these as incompatible, both in the synopsis and
help output, and enforce it in the code itself.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-23 10:30:43 -07:00
ff76fc841f help tests: add test for --config output
Add a missing test for checking what the --config output added in
ac68a93fd2 (help: add --config to list all available config,
2018-05-26) looks like. We should not be emitting anything except
config variables and the brief usage information at the end here.

The second test regexp here might not match three-level variables in
general, as their second level could contain ".", but in this case
we're always emitting what we extract from the documentation, so it's
all strings like:

    foo.<name>.bar

If we did introduce something like variable example content here we'd
like this to break, since we'd then be likely to break the
git-completion.bash.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-23 10:30:43 -07:00
9856ea6785 help: correct usage & behavior of "git help --guides"
As noted in 65f98358c0 (builtin/help.c: add --guide option,
2013-04-02) and a133737b80 (doc: include --guide option description
for "git help", 2013-04-02) which introduced the --guide option, it
cannot be combined with e.g. <command>.

Change the command and the "SYNOPSIS" section to reflect that desired
behavior. Now that we assert this in code we don't need to
exhaustively describe the previous confusing behavior in the
documentation either, instead of silently ignoring the provided
argument we'll now error out.

The "We're done. Ignore any remaining args" comment added in
15f7d49438 (builtin/help.c: split "-a" processing into two,
2013-04-02) can now be removed, it's obvious that we're asserting the
behavior with the check of "argc".

The "--config" option is still missing from the synopsis, it will be
added in a subsequent commit where we'll fix bugs in its
implementation.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-23 10:30:43 -07:00
28ecef4c84 Merge branch 'jk/grep-haystack-is-read-only' into hm/paint-hits-in-log-grep
* jk/grep-haystack-is-read-only:
  grep: store grep_source buffer as const
  grep: mark "haystack" buffers as const
  grep: stop modifying buffer in grep_source_1()
  grep: stop modifying buffer in show_line()
  grep: stop modifying buffer in strip_timestamp
2021-09-23 09:47:05 -07:00
731b6859c4 Makefile: make COMPUTE_HEADER_DEPENDENCIES=auto work with DEVOPTS=pedantic
The "COMPUTE_HEADER_DEPENDENCIES" feature added in [1] was extended to
use auto-detection in [2], that "auto" detection has always piped
STDERR to /dev/null, so any failures on compilers that didn't support
these GCC flags would silently fall back to
"COMPUTE_HEADER_DEPENDENCIES=no".

Later when -Wpedantic support was added to DEVOPTS in [3] we started
passing -Wpedantic in combination with -Werror to the compiler
here. Note (to the pedantic): [3] actually passed "-pedantic", but it
and "-Wpedantic" are synonyms.

Turning on -Wpedantic in [3] broke the auto-detection, since this
relies on compiling an empty program. GCC would loudly complain on
STDERR:

    /dev/null:1: error: ISO C forbids an empty translation unit
    [-Werror=pedantic]
    cc1: note: unrecognized command-line option
    ‘-Wno-pedantic-ms-format’ may have been intended to silence
    earlier diagnostics
    cc1: all warnings being treated as errors

But as that ended up in the "$(dep_check)" variable due to the "2>&1"
in [2] we didn't see it.

Then when [4] made DEVOPTS=pedantic the default specifying
"DEVELOPER=1" would effectively set "COMPUTE_HEADER_DEPENDENCIES=no".

To fix these issues let's unconditionally pass -Wno-pedantic after
$(ALL_CFLAGS), we might get a -Wpedantic via config.mak.dev after, or
the builder might specify it via CFLAGS. In either case this will undo
current and future problems with -Wpedantic.

I think it would make sense to simply remove the "2>&1", it would mean
that anyone using a non-GCC-like compiler would get warnings under
COMPUTE_HEADER_DEPENDENCIES=auto, e.g on AIX's xlc would emit:

    /opt/IBM/xlc/13.1.3/bin/.orig/xlc: 1501-208 (S) command option D is missing a subargument
    Non-zero 40 exit with COMPUTE_HEADER_DEPENDENCIES=auto, set it to "yes" or "no" to quiet auto-detect

And on Solaris with SunCC:

    cc: Warning: Option -x passed to ld, if ld is invoked, ignored otherwise
    cc: refused to overwrite input file by output file: /dev/null
    cc: Warning: Option -x passed to ld, if ld is invoked, ignored otherwise
    cc: refused to overwrite input file by output file: /dev/null
    Non-zero 1 exit with COMPUTE_HEADER_DEPENDENCIES=auto, set it to "yes" or "no" to quiet auto-detect

Both could be quieted by setting COMPUTE_HEADER_DEPENDENCIES=no
explicitly, as suggested, but let's see if this'll fix it without
emitting too much noise at those that aren't using "gcc" or "clang".

1. f2fabbf76e (Teach Makefile to check header dependencies,
   2010-01-26)
2. 111ee18c31 (Makefile: Use computed header dependencies if the
   compiler supports it, 2011-08-18)
3. 729b3925ed (Makefile: add a DEVOPTS flag to get pedantic
   compilation, 2018-07-24)
4. 6a8cbc41ba (developer: enable pedantic by default, 2021-09-03)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-22 21:38:57 -07:00
c234e8a0ec Makefile: make the "sparse" target non-.PHONY
Change the "sparse" target and its *.sp dependencies to be
non-.PHONY. Before this change "make sparse" would take ~5s to re-run
all the *.c files through "cgcc", after it it'll create an empty *.sp
file sitting alongside the *.c file, only if the *.c file or its
dependencies are newer than the *.sp is the *.sp re-made.

We ensure that the recursive dependencies are correct by depending on
the *.o file, which in turn will have correct dependencies by either
depending on all header files, or under
"COMPUTE_HEADER_DEPENDENCIES=yes" the headers it needs.

This means that a plain "make sparse" is much slower, as we'll now
need to make the *.o files just to create the *.sp files, but
incrementally creating the *.sp files is *much* faster and less
verbose, it thus becomes viable to run "sparse" along with "all" as
e.g. "git rebase --exec 'make all sparse'".

On my box with -j8 "make sparse" was fast before, or around 5 seconds,
now it only takes that long the first time, and the common case is
<100ms, or however long it takes GNU make to stat the *.sp file and
see that all the corresponding *.c file and its dependencies are
older.

See 0bcd9ae85d (sparse: Fix errors due to missing target-specific
variables, 2011-04-21) for the modern implementation of the sparse
target being changed here.

It is critical that we use -Wsparse-error here, otherwise the error
would only show up once, but we'd successfully create the empty *.sp
file, and running a second time wouldn't show the error. I'm therefore
not putting it into SPARSE_FLAGS or SP_EXTRA_FLAGS, it's not optional,
the Makefile logic won't behave properly without it.

Appending to $@ without a move is OK here because we're using the
.DELETE_ON_ERROR Makefile feature. See 7b76d6bf22 (Makefile: add and
use the ".DELETE_ON_ERROR" flag, 2021-06-29). GNU make ensures that on
error this file will be removed.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-22 21:36:11 -07:00
b66c77a64e http: match headers case-insensitively when redacting
When HTTP/2 is in use, we fail to correctly redact "Authorization" (and
other) headers in our GIT_TRACE_CURL output.

We get the headers in our CURLOPT_DEBUGFUNCTION callback, curl_trace().
It passes them along to curl_dump_header(), which in turn checks
redact_sensitive_header(). We see the headers as a text buffer like:

  Host: ...
  Authorization: Basic ...

After breaking it into lines, we match each header using skip_prefix().
This is case-sensitive, even though HTTP headers are case-insensitive.
This has worked reliably in the past because these headers are generated
by curl itself, which is predictable in what it sends.

But when HTTP/2 is in use, instead we get a lower-case "authorization:"
header, and we fail to match it. The fix is simple: we should match with
skip_iprefix().

Testing is more complicated, though. We do have a test for the redacting
feature, but we don't hit the problem case because our test Apache setup
does not understand HTTP/2. You can reproduce the issue by applying this
on top of the test change in this patch:

	diff --git a/t/lib-httpd/apache.conf b/t/lib-httpd/apache.conf
	index afa91e38b0..19267c7107 100644
	--- a/t/lib-httpd/apache.conf
	+++ b/t/lib-httpd/apache.conf
	@@ -29,6 +29,9 @@ ErrorLog error.log
	 	LoadModule setenvif_module modules/mod_setenvif.so
	 </IfModule>

	+LoadModule http2_module modules/mod_http2.so
	+Protocols h2c
	+
	 <IfVersion < 2.4>
	 LockFile accept.lock
	 </IfVersion>
	@@ -64,8 +67,8 @@ LockFile accept.lock
	 <IfModule !mod_access_compat.c>
	 	LoadModule access_compat_module modules/mod_access_compat.so
	 </IfModule>
	-<IfModule !mod_mpm_prefork.c>
	-	LoadModule mpm_prefork_module modules/mod_mpm_prefork.so
	+<IfModule !mod_mpm_event.c>
	+	LoadModule mpm_event_module modules/mod_mpm_event.so
	 </IfModule>
	 <IfModule !mod_unixd.c>
	 	LoadModule unixd_module modules/mod_unixd.so
	diff --git a/t/t5551-http-fetch-smart.sh b/t/t5551-http-fetch-smart.sh
	index 1c2a444ae7..ff74f0ae8a 100755
	--- a/t/t5551-http-fetch-smart.sh
	+++ b/t/t5551-http-fetch-smart.sh
	@@ -24,6 +24,10 @@ test_expect_success 'create http-accessible bare repository' '
	 	git push public main:main
	 '

	+test_expect_success 'prefer http/2' '
	+	git config --global http.version HTTP/2
	+'
	+
	 setup_askpass_helper

	 test_expect_success 'clone http repository' '

but this has a few issues:

  - it's not necessarily portable. The http2 apache module might not be
    available on all systems. Further, the http2 module isn't compatible
    with the prefork mpm, so we have to switch to something else. But we
    don't necessarily know what's available. It would be nice if we
    could have conditional config, but IfModule only tells us if a
    module is already loaded, not whether it is available at all.

    This might be a non-issue. The http tests are already optional, and
    modern-enough systems may just have both of these. But...

  - if we do this, then we'd no longer be testing HTTP/1.1 at all. I'm
    not sure how much that matters since it's all handled by curl under
    the hood, but I'd worry that some detail leaks through. We'd
    probably want two scripts running similar tests, one with HTTP/2 and
    one with HTTP/1.1.

  - speaking of which, a later test fails with the patch above! The
    problem is that it is making sure we used a chunked
    transfer-encoding by looking for that header in the trace. But
    HTTP/2 doesn't support that, as it has its own streaming mechanisms
    (the overall operation works fine; we just don't see the header in
    the trace).

Furthermore, even with the changes above, this test still does not
detect the current failure, because we see _both_ HTTP/1.1 and HTTP/2
requests, which confuse it. Quoting only the interesting bits from the
resulting trace file, we first see:

  => Send header: GET /auth/smart/repo.git/info/refs?service=git-upload-pack HTTP/1.1
  => Send header: Connection: Upgrade, HTTP2-Settings
  => Send header: Upgrade: h2c
  => Send header: HTTP2-Settings: AAMAAABkAAQCAAAAAAIAAAAA

  <= Recv header: HTTP/1.1 401 Unauthorized
  <= Recv header: Date: Wed, 22 Sep 2021 20:03:32 GMT
  <= Recv header: Server: Apache/2.4.49 (Debian)
  <= Recv header: WWW-Authenticate: Basic realm="git-auth"

So the client asks for HTTP/2, but Apache does not do the upgrade for
the 401 response. Then the client repeats with credentials:

  => Send header: GET /auth/smart/repo.git/info/refs?service=git-upload-pack HTTP/1.1
  => Send header: Authorization: Basic <redacted>
  => Send header: Connection: Upgrade, HTTP2-Settings
  => Send header: Upgrade: h2c
  => Send header: HTTP2-Settings: AAMAAABkAAQCAAAAAAIAAAAA

  <= Recv header: HTTP/1.1 101 Switching Protocols
  <= Recv header: Upgrade: h2c
  <= Recv header: Connection: Upgrade
  <= Recv header: HTTP/2 200
  <= Recv header: content-type: application/x-git-upload-pack-advertisement

So the client does properly redact there, because we're speaking
HTTP/1.1, and the server indicates it can do the upgrade. And then the
client will make further requests using HTTP/2:

  => Send header: POST /auth/smart/repo.git/git-upload-pack HTTP/2
  => Send header: authorization: Basic dXNlckBob3N0OnBhc3NAaG9zdA==
  => Send header: content-type: application/x-git-upload-pack-request

And there we can see that the credential is _not_ redacted. This part of
the test is what gets confused:

	# Ensure that there is no "Basic" followed by a base64 string, but that
	# the auth details are redacted
	! grep "Authorization: Basic [0-9a-zA-Z+/]" trace &&
	grep "Authorization: Basic <redacted>" trace

The first grep does not match the un-redacted HTTP/2 header, because
it insists on an uppercase "A". And the second one does find the
HTTP/1.1 header. So as far as the test is concerned, everything is OK,
but it failed to notice the un-redacted lines.

We can make this test (and the other related ones) more robust by adding
"-i" to grep case-insensitively. This isn't really doing anything for
now, since we're not actually speaking HTTP/2, but it future-proofs the
tests for a day when we do (either we add explicit HTTP/2 test support,
or it's eventually enabled by default by our Apache+curl test setup).
And it doesn't hurt in the meantime for the tests to be more careful.

The change to use "grep -i", coupled with the changes to use HTTP/2
shown above, causes the test to fail with the current code, and pass
after this patch is applied.

And finally, there's one other way to demonstrate the issue (and how I
actually found it originally). Looking at GIT_TRACE_CURL output against
github.com, you'll see the unredacted output, even if you didn't set
http.version. That's because setting it is only necessary for curl to
send the extra headers in its HTTP/1.1 request that say "Hey, I speak
HTTP/2; upgrade if you do, too". But for a production site speaking
https, the server advertises via ALPN, a TLS extension, that it supports
HTTP/2, and the client can immediately start using it.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-22 21:24:58 -07:00
6295f87b5f Merge branch 'jt/add-submodule-odb-clean-up' into jt/no-abuse-alternate-odb-for-submodules
* jt/add-submodule-odb-clean-up:
  revision: remove "submodule" from opt struct
  repository: support unabsorbed in repo_submodule_init
  submodule: remove unnecessary unabsorbed fallback
2021-09-22 17:11:09 -07:00
51b04c05b7 difftool: fix word spacing in the usage strings
Remove spaces in `non - zero` and add a space between the diff
format/mode and option parentheses in difftool's usage strings.

Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-22 17:09:02 -07:00
2e54907e83 test-lib tests: get rid of copy/pasted mock test code
Now that we've split up the write_sub_test_lib_test*() and
run_sub_test_lib_test*() functions let's fix those tests in
t0000-basic.sh that were verbosely copy/pasting earlier tests.

That we caught all of them was asserted with a follow-up change that's
not part of this series[1], we might add such a duplication check at
some later time, but for now let's just one-off remove the duplicate
boilerplate.

1. https://lore.kernel.org/git/patch-v3-6.9-bc79b29f3c-20210805T103237Z-avarab@gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-22 16:22:41 -07:00
56722a0635 test-lib tests: assert 1 exit code, not non-zero
Improve the testing for test-lib.sh itself to assert that we have a
exit code of 1, not any non-zero. Improves code added in
0445e6f0a1 (test-lib: '--run' to run only specific tests,
2014-04-30).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-22 16:22:41 -07:00
e07b817cfc test-lib tests: refactor common part of check_sub_test_lib_test*()
Refactor the two check_sub_test_lib_test*() functions to avoid
duplicating the same comparison they did of stdout. This duplication
was initially added when check_sub_test_lib_test_err() was added in
0445e6f0a1 (test-lib: '--run' to run only specific tests,
2014-04-30).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-22 16:22:41 -07:00
12fe4909fa test-lib tests: avoid subshell for "test_cmp" for readability
The use of a sub-shell for running the test_cmp of stdout/stderr for
the test author was introduced in this form in 565b6fa87b (tests:
refactor mechanics of testing in a sub test-lib, 2012-12-16), but from
looking at the history that seemed to have diligently copied my
original ad-hoc implementation in 7b90511970 (t/t0000-basic.sh: Run
the passing TODO test inside its own test-lib, 2010-08-19).

There's no reason to use a subshell here, we try to avoid it in
general. It also improves readability, if the test fails we print out
the relative path in the trash directory that needs to be looked
at.

Before that was mostly obscured, since the "write_sub_test_lib_test"
will pick the directory for you from the test name.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-22 16:22:41 -07:00
c3ff7be6fb test-lib tests: don't provide a description for the sub-tests
Change the $test_description provided for the generated subtests to be
constant, since the only purpose of having it is that test-lib.sh will
barf if it isn't supplied.

The other purpose of having it was to effectively split up the test
description between the argument to test_expect_success and the
argument to "write_and_run_sub_test_lib_test". Let's only use one of
the two.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-22 16:22:41 -07:00
9f0a45208d test-lib tests: split up "write and run" into two functions
Refactor the function to write and run tests of the test-lib.sh output
into two functions.

When this was added back in 565b6fa87b (tests: refactor mechanics of
testing in a sub test-lib, 2012-12-16) there was no reason to do this,
but since we started supporting test arguments in
517cd55fd5 (test-lib: self-test that --verbose works, 2013-06-23)
we've started to write out duplicate tests simply to test different
arguments, now we'll be able to re-use them.

This change doesn't consolidate any of those tests yet, it just makes
it possible to do so. All the changes in t0000-basic.sh are a simple
search-replacement.

Since the _run_sub_test_lib_test_common() function doesn't handle
running the test anymore we can do away with the sub-shell, which was
used to scope an "unset" and "export" shell variables.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-22 16:22:41 -07:00
4631cfc20b parse-options: properly align continued usage output
Some commands such as "git stash" emit continued options output with
e.g. "git stash -h", because usage_with_options_internal() prefixes
with its own whitespace the resulting output wasn't properly
aligned. Let's account for the added whitespace, which properly aligns
the output.

The "git stash" command has usage output with a N_() translation that
legitimately stretches across multiple lines;

	N_("git stash [push [-p|--patch] [-k|--[no-]keep-index] [-q|--quiet]\n"
	   "          [-u|--include-untracked] [-a|--all] [-m|--message <message>]\n"
           [...]

We'd like to have that output aligned with the length of the initial
"git stash " output, but since usage_with_options_internal() adds its
own whitespace prefixing we fell short, before this change we'd emit:

    $ git stash -h
    usage: git stash list [<options>]
       or: git stash show [<options>] [<stash>]
       [...]
       or: git stash [push [-p|--patch] [-k|--[no-]keep-index] [-q|--quiet]
              [-u|--include-untracked] [-a|--all] [-m|--message <message>]
              [...]

Now we'll properly emit aligned output.  I.e. the last four lines
above will instead be (a whitespace-only change to the above):

       [...]
       or: git stash [push [-p|--patch] [-k|--[no-]keep-index] [-q|--quiet]
                     [-u|--include-untracked] [-a|--all] [-m|--message <message>]
                     [...]

We could also go for an approach where we have the caller support no
padding of their own, i.e. (same as the first example, except for the
padding on the second line):

	N_("git stash [push [-p|--patch] [-k|--[no-]keep-index] [-q|--quiet]\n"
	   "[-u|--include-untracked] [-a|--all] [-m|--message <message>]\n"
           [...]

But to do that we'll need to find the length of "git stash". We can
discover that from the "cmd" in the "struct cmd_struct", but there
might be cases with sub-commands or "git" itself taking arguments that
would make that non-trivial.

Even if it were I still think this approach is better, because this way
we'll get the same legible alignment in the C code. The fact that
usage_with_options_internal() is adding its own prefix padding is an
implementation detail that callers shouldn't need to worry about.

Implementation notes:

We could skip the string_list_split() with a strchr(str, '\n') check,
but we'd then need to duplicate our state machine for strings that do
and don't contain a "\n". It's simpler to just always split into a
"struct string_list", even though the common case is that that "struct
string_list" will contain only one element. This is not
performance-sensitive code.

This change is relatively more complex since I've accounted for making
it future-proof for RTL translation support. Later in
usage_with_options_internal() we have some existing padding code
dating back to d7a38c54a6 (parse-options: be able to generate usages
automatically, 2007-10-15) which isn't RTL-safe, but that code would
be easy to fix. Let's not introduce new RTL translation problems here.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-22 16:17:45 -07:00
1cc31e1529 MyFirstContribution: Document --range-diff option when writing v2
In the "Sending v2" section, readers are directed to create v2 patches
without using --range-diff. However, it is customary to include a
range-diff against the v1 patches as a reviewer aid.

Update the "Sending v2" section to suggest a simple workflow that uses
the --range-diff option. Also include some explanation for -v2 and
--range-diff to help the reader understand the importance.

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-22 14:25:05 -07:00
f0a74bcb03 Makefile: clean .depend dirs under COMPUTE_HEADER_DEPENDENCIES != yes
Fix a logic error in dfea575017 (Makefile: lazily compute header
dependencies, 2010-01-26) where we'd make whether we cleaned the
.depend dirs contingent on the currently configured
COMPUTE_HEADER_DEPENDENCIES value. Before this running e.g.:

    make COMPUTE_HEADER_DEPENDENCIES=yes grep.o
    make COMPUTE_HEADER_DEPENDENCIES=no clean

Would leave behind the .depend directory, now it'll be removed.

Normally we'd need to use another variable, but in this case there's
no other uses of $(dep_dirs), as opposed to $(dep_args) which is used
as an argument to $(CC). So just deleting this line makes everything
work correctly.

See http://lore.kernel.org/git/xmqqmto48ufz.fsf@gitster.g for a report
about this issue.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-22 13:34:34 -07:00
f9d65b04cd t/perf/run: fix bin-wrappers computation
The GIT_TEST_INSTALLED was moved from perf-lib.sh to run in df0f5021
(perf-lib.sh: remove GIT_TEST_INSTALLED from perf-lib.sh, 2019-05-07)
and that included a change to how it inspected the existence of a
bin-wrappers directory. However, that included a typo that made the
match of bin-wrappers never work. Specifically, the assignment was

	mydir_abs_wrappers="$mydir_abs_wrappers/bin-wrappers"

which uses the same variable before it is initialized. By changing it to

	mydir_abs_wrappers="$mydir_abs/bin-wrappers"

We can correctly use the bin-wrappers directory.

This is critical to successfully computing performance of commands that
execute subcommands. The bin-wrappers ensure that the --exec-path is set
correctly.

Reported-by: Victoria Dye <vdye@github.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-22 13:26:11 -07:00
c21919f1b2 repository.h: don't use a mix of int and bitfields
Change the bitfield added in 58300f4743 (sparse-index: add
index.sparse config option, 2021-03-30) and 3964fc2aae (sparse-index:
add guard to ensure full index, 2021-03-30) to just use an "int"
boolean instead.

It might be smart to optimize the space here in the future, but by
consistently using an "int" we can take its address and pass it to
repo_cfg_bool(), and therefore don't need to handle "sparse_index" as
a special-case when reading the "index.sparse" setting.

There's no corresponding config for "command_requires_full_index", but
let's change it too for consistency and to prevent future bugs
creeping in due to one of these being "unsigned".

Using "int" consistently also prevents subtle bugs or undesired
control flow creeping in here. Before the preceding commit the
initialization of "command_requires_full_index" in
prepare_repo_settings() did nothing, i.e. this:

    r->settings.command_requires_full_index = 1

Was redundant to the earlier memset() to -1. Likewise for
"sparse_index" added in 58300f4743 (sparse-index: add index.sparse
config option, 2021-03-30) the code and comment added there was
misleading, we weren't initializing it to off, but re-initializing it
from "1" to "0", and then finally checking the config, and perhaps
setting it to "1" again. I.e. we could have applied this patch before
the preceding commit:

	+	assert(r->settings.command_requires_full_index == 1);
	 	r->settings.command_requires_full_index = 1;

	 	/*
	 	 * Initialize this as off.
	 	 */
	+	assert(r->settings.sparse_index == 1);
	 	r->settings.sparse_index = 0;
	 	if (!repo_config_get_bool(r, "index.sparse", &value) && value)
	 		r->settings.sparse_index = 1;

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-22 13:15:00 -07:00
3050b6dfc7 repo-settings.c: simplify the setup
Simplify the setup code in repo-settings.c in various ways, making the
code shorter, easier to read, and requiring fewer hacks to do the same
thing as it did before:

Since 7211b9e753 (repo-settings: consolidate some config settings,
2019-08-13) we have memset() the whole "settings" structure to -1 in
prepare_repo_settings(), and subsequently relied on the -1 value.

Most of the fields did not need to be initialized to -1, and because
we were doing that we had the enum labels "UNTRACKED_CACHE_UNSET" and
"FETCH_NEGOTIATION_UNSET" purely to reflect the resulting state
created this memset() in prepare_repo_settings(). No other code used
or relied on them, more on that below.

For the rest most of the subsequent "are we -1, then read xyz" can
simply be removed by re-arranging what we read first. E.g. when
setting the "index.version" setting we should have first read
"feature.experimental", so that it (and "feature.manyfiles") can
provide a default for our "index.version".

Instead the code setting it, added when "feature.manyFiles"[1] was
created, was using the UPDATE_DEFAULT_BOOL() macro added in an earlier
commit[2]. That macro is now gone, since it was only needed for this
pattern of reading things in the wrong order.

This also fixes an (admittedly obscure) logic error where we'd
conflate an explicit "-1" value in the config with our own earlier
memset() -1.

We can also remove the UPDATE_DEFAULT_BOOL() wrapper added in
[3]. Using it is redundant to simply using the return value from
repo_config_get_bool(), which is non-zero if the provided key exists
in the config.

Details on edge cases relating to the memset() to -1, continued from
"more on that below" above:

 * UNTRACKED_CACHE_KEEP:

   In [4] the "unset" and "keep" handling for core.untrackedCache was
   consolidated. But it while we understand the "keep" value, we don't
   handle it differently than the case of any other unknown value.

   So let's retain UNTRACKED_CACHE_KEEP and remove the
   UNTRACKED_CACHE_UNSET setting (which was always implicitly
   UNTRACKED_CACHE_KEEP before). We don't need to inform any code
   after prepare_repo_settings() that the setting was "unset", as far
   as anyone else is concerned it's core.untrackedCache=keep. if
   "core.untrackedcache" isn't present in the config.

 * FETCH_NEGOTIATION_UNSET & FETCH_NEGOTIATION_NONE:

   Since these two two enum fields added in [5] don't rely on the
   memzero() setting them to "-1" anymore we don't have to provide
   them with explicit values.

1. c6cc4c5afd (repo-settings: create feature.manyFiles setting,
   2019-08-13)
2. 31b1de6a09 (commit-graph: turn on commit-graph by default,
   2019-08-13)
3. 31b1de6a09 (commit-graph: turn on commit-graph by default,
   2019-08-13)
4. ad0fb65999 (repo-settings: parse core.untrackedCache,
   2019-08-13)
5. aaf633c2ad (repo-settings: create feature.experimental setting,
   2019-08-13)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-22 13:15:00 -07:00
f1bee82873 read-cache & fetch-negotiator: check "enum" values in switch()
Change tweak_untracked_cache() in "read-cache.c" to use a switch() to
have the compiler assert that we checked all possible values in the
"enum untracked_cache_setting" type, and likewise remove the "default"
case in fetch_negotiator_init() in favor of checking for
"FETCH_NEGOTIATION_UNSET" and "FETCH_NEGOTIATION_NONE".

As will be discussed in a subsequent we'll only ever have either of
these set to FETCH_NEGOTIATION_NONE, FETCH_NEGOTIATION_UNSET and
UNTRACKED_CACHE_UNSET within the prepare_repo_settings() function
itself. In preparation for fixing that code let's add a BUG() here to
mark this as unreachable code.

See ad0fb65999 (repo-settings: parse core.untrackedCache, 2019-08-13)
for when the "unset" and "keep" handling for core.untrackedCache was
consolidated, and aaf633c2ad (repo-settings: create
feature.experimental setting, 2019-08-13) for the addition of the
"default" pattern in "fetch-negotiator.c".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-22 13:15:00 -07:00
c6b4888b3f environment.c: remove test-specific "ignore_untracked..." variable
Instead of the global ignore_untracked_cache_config variable added in
dae6c322fa (test-dump-untracked-cache: don't modify the untracked
cache, 2016-01-27) we can make use of the new facility to set config
via environment variables added in d8d77153ea (config: allow
specifying config entries via envvar pairs, 2021-01-12).

It's arguably a bit hacky to use setenv() and getenv() to pass
messages between the same program, but since the test helpers are not
the main intended audience of repo-settings.c I think it's better than
hardcoding the test-only special-case in prepare_repo_settings().

This uses the xsetenv() wrapper added in the preceding commit, if we
don't set these in the environment we'll fail in
t7063-status-untracked-cache.sh, but let's fail earlier anyway if that
were to happen.

This breaks any parent process that's potentially using the
GIT_CONFIG_* and GIT_CONFIG_PARAMETERS mechanism to pass one-shot
config setting down to a git subprocess, but in this case we don't
care about the general case of such potential parents. This process
neither spawns other "git" processes, nor is it interested in other
configuration. We might want to pick up other test modes here, but
those will be passed via GIT_TEST_* environment variables.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-22 13:15:00 -07:00
3540c71ea5 wrapper.c: add x{un,}setenv(), and use xsetenv() in environment.c
Add fatal wrappers for setenv() and unsetenv(). In d7ac12b25d (Add
set_git_dir() function, 2007-08-01) we started checking its return
value, and since 48988c4d0c (set_git_dir: die when setenv() fails,
2018-03-30) we've had set_git_dir_1() die if we couldn't set it.

Let's provide a wrapper for both, this will be useful in many other
places, a subsequent patch will make another use of xsetenv().

The checking of the return value here is over-eager according to
setenv(3) and POSIX. It's documented as returning just -1 or 0, so
perhaps we should be checking -1 explicitly.

Let's just instead die on any non-zero, if our C library is so broken
as to return something else than -1 on error (and perhaps not set
errno?) the worst we'll do is die with a nonsensical errno value, but
we'll want to die in either case.

Let's make these return "void" instead of "int". As far as I can tell
there's no other x*() wrappers that needed to make the decision of
deviating from the signature in the C library, but since their return
value is only used to indicate errors (so we'd die here), we can catch
unreachable code such as

    if (xsetenv(...) < 0)
        [...];

I think it would be OK skip the NULL check of the "name" here for the
calls to die_errno(). Almost all of our setenv() callers are taking a
constant string hardcoded in the source as the first argument, and for
the rest we can probably assume they've done the NULL check
themselves. Even if they didn't, modern C libraries are forgiving
about it (e.g. glibc formatting it as "(null)"), on those that aren't,
well, we were about to die anyway. But let's include the check anyway
for good measure.

1. https://pubs.opengroup.org/onlinepubs/009604499/functions/setenv.html

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-22 13:15:00 -07:00
54b4d125d5 ls-files: use imperative mood for -X and -z option description
Usage description for -X and -z options use descriptive instead of
imperative mood. Edit it for consistency with other options.

Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-22 12:07:37 -07:00
7740ac691d rebase: dereference tags
A rebase started with 'git rebase <A> <B>' is conceptually to first
checkout <B> and run 'git rebase <A>' starting from that state.  'git
rebase --abort' in the middle of such a rebase should take us back to
the state we checked out <B>.

This used to work, even when <B> is a tag that points at a commit,
until Git 2.20.0 when the command was reimplemented in C.  The command
now complains that the tag object itself cannot be checked out, which
may be technically correct but is not what the user asked to do.

Fix this old regression by using lookup_commit_reference_by_name()
when parsing <B>. The scripted version did not need to peel the tag
because the commands it passed the tag to (e.g 'git reset') peeled the
tag themselves.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-22 12:04:52 -07:00
1d188263e0 rebase: use lookup_commit_reference_by_name()
peel_committish() appears to have been copied from the scripted rebase
but it duplicates the functionality of
lookup_commit_reference_by_name() so lets use that instead.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-22 12:04:52 -07:00
35f070b4de rebase: use our standard error return value
Git uses −1 to signal an error. The builtin rebase converts these to
+1 all over the place using !! (presumably because the in the scripted
version an error was signalled by +1). This is confusing and clutters
the code, we only need to convert the value when the function returns.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-22 12:04:51 -07:00
1e66871608 grep: store grep_source buffer as const
Our grep_buffer() function takes a non-const buffer, which is confusing:
we don't take ownership of nor write to the buffer.

This mostly comes from the fact that the underlying grep_source struct
in which we store the buffer uses non-const pointer. The memory pointed
to by the struct is sometimes owned by us (for FILE or OID sources), and
sometimes not (for BUF sources).

Let's store it as const, which lets us err on the side of caution (i.e.,
the compiler will warn us if any of our code writes to or tries to free
it).

As a result, we must annotate the one place where we do free it by
casting away the constness. But that's a small price to pay for the
extra safety and clarity elsewhere (and indeed, it already had a comment
explaining why GREP_SOURCE_BUF _didn't_ free it).

And then we can mark grep_buffer() as taking a const buffer.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-22 11:59:50 -07:00
1a845fbc48 grep: mark "haystack" buffers as const
When we're grepping in a buffer, we don't need to modify it. So we can
take "const char *" buffers, rather than "char *". This can avoid some
awkward casts in our callers, and make our expectations more clear (we
will not take ownership of the memory, nor will we ever write to it).

These spots don't all necessarily have to be converted at the same time,
but some of them are dependent on others, because we pass
pointers-to-pointers in a few cases. And none of this should change any
behavior, since we're just adding "const" qualifiers (and likewise, the
compiler will let us know if we missed any spots). So it's relatively
low-risk to just do this all at once.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-22 11:59:50 -07:00
f84e79ff4b grep: stop modifying buffer in grep_source_1()
We find the end of each matching line of a buffer, and then temporarily
write a NUL to turn it into a regular C string. But we don't need to do
so, because the only thing we do in the interim is pass the line and its
length (via an "eol" pointer) to match_line(). And that function should
only look at the bytes we passed it, whether it has a terminating NUL or
not.

We can drop this temporary write in order to simplify the code and make
it easier to use const buffers in more of grep.c.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-22 11:59:50 -07:00
995e525b17 grep: stop modifying buffer in show_line()
When showing lines via grep (or looking for funcnames), we call
show_line() on a multi-line buffer. It finds the end of line and marks
it with a NUL. However, we don't need to do so, as the resulting line is
only used along with its "eol" marker:

 - we pass both to next_match(), which takes care to look at only the
   bytes we specified

 - we pass the line to output_color() without its matching eol marker.
   However, we do use the "match" struct we got from next_match() to
   tell it how many bytes to look at (which can never exceed the string
   we passed it).

So we can stop setting and restoring this NUL marker. That makes the
code simpler, and will allow us to take a const buffer in a future
patch.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-22 11:59:50 -07:00
cc8e26ee8d grep: stop modifying buffer in strip_timestamp
When grepping for headers in commit objects, we receive individual
lines (e.g., "author Name <email> 1234 -0000"), and then strip off the
timestamp to do our match. We do so by writing a NUL byte over the
whitespace separator, and then remembering to restore it later.

We had to do it this way when this was added back in a4d7d2c6db (log
--author/--committer: really match only with name part, 2008-09-04),
because we fed the result directly to regexec(), which expects a
NUL-terminated string. But since b7d36ffca0 (regex: use regexec_buf(),
2016-09-21), we have a function which can match part of a buffer.

So instead of modifying the string, we can instead just move the "eol"
pointer, and the rest of the code will do the right thing. This will let
further patches mark more buffers as "const".

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-22 11:59:50 -07:00
0394f8d002 builtin/multi-pack-index.c: disable top-level --[no-]progress
In a similar spirit as the previous patch, let sub-commands which
support showing or hiding a progress meter handle parsing the
`--progress` or `--no-progress` option, but do not expose it as an
option to the top-level `multi-pack-index` builtin.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-22 09:26:29 -07:00
99c99ed825 The seventh batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-20 15:20:46 -07:00
71e36361bf Merge branch 'jk/t5562-racefix'
Test update.

* jk/t5562-racefix:
  t5562: use alarm() to interrupt timed child-wait
2021-09-20 15:20:46 -07:00
2a6d0b2b68 Merge branch 'rs/no-mode-to-open-when-appending'
The "mode" word is useless in a call to open(2) that does not
create a new file.  Such a call in the files backend of the ref
subsystem has been cleaned up.

* rs/no-mode-to-open-when-appending:
  refs/files-backend: remove unused open mode parameter
2021-09-20 15:20:45 -07:00
10a08cbd39 Merge branch 'rs/setup-use-xopen-and-xdup'
Code clean-up.

* rs/setup-use-xopen-and-xdup:
  setup: use xopen and xdup in sanitize_stdfds
2021-09-20 15:20:45 -07:00
c042ad5ad5 Merge branch 'js/run-command-close-packs'
The run-command API has been updated so that the callers can easily
ask the file descriptors open for packfiles to be closed immediately
before spawning commands that may trigger auto-gc.

* js/run-command-close-packs:
  Close object store closer to spawning child processes
  run_auto_maintenance(): implicitly close the object store
  run-command: offer to close the object store before running
  run-command: prettify the `RUN_COMMAND_*` flags
  pull: release packs before fetching
  commit-graph: when closing the graph, also release the slab
2021-09-20 15:20:45 -07:00
a16dd13740 Merge branch 'ds/mergies-with-sparse-index'
Various mergy operations have been prepared to work efficiently
with the sparse index.

* ds/mergies-with-sparse-index:
  sparse-index: integrate with cherry-pick and rebase
  sequencer: ensure full index if not ORT strategy
  t1092: add cherry-pick, rebase tests
  merge-ort: expand only for out-of-cone conflicts
  merge: make sparse-aware with ORT
  diff: ignore sparse paths in diffstat
2021-09-20 15:20:45 -07:00
dc89c34d9e Merge branch 'ds/sparse-index-ignored-files'
In cone mode, the sparse-index code path learned to remove ignored
files (like build artifacts) outside the sparse cone, allowing the
entire directory outside the sparse cone to be removed, which is
especially useful when the sparse patterns change.

* ds/sparse-index-ignored-files:
  sparse-checkout: clear tracked sparse dirs
  sparse-index: add SPARSE_INDEX_MEMORY_ONLY flag
  attr: be careful about sparse directories
  sparse-checkout: create helper methods
  sparse-index: use WRITE_TREE_MISSING_OK
  sparse-index: silently return when cache tree fails
  unpack-trees: fix nested sparse-dir search
  sparse-index: silently return when not using cone-mode patterns
  t7519: rewrite sparse index test
2021-09-20 15:20:44 -07:00
e78db9d303 Merge branch 'ar/submodule-run-update-procedure'
Reimplementation of parts of "git submodule" in C continues.

* ar/submodule-run-update-procedure:
  submodule--helper: run update procedures from C
2021-09-20 15:20:44 -07:00
1b8bd2243e Merge branch 'ab/make-tags-cleanup'
Build clean-up for "make tags" and friends.

* ab/make-tags-cleanup:
  Makefile: normalize clobbering & xargs for tags targets
  Makefile: remove "cscope.out", not "cscope*" in cscope.out target
  Makefile: don't use "FORCE" for tags targets
  Makefile: add QUIET_GEN to "cscope" target
  Makefile: move ".PHONY: cscope" near its target
2021-09-20 15:20:43 -07:00
5331af2352 Merge branch 'ab/serve-cleanup'
Code clean-up around "git serve".

* ab/serve-cleanup:
  upload-pack: document and rename --advertise-refs
  serve.[ch]: remove "serve_options", split up --advertise-refs code
  {upload,receive}-pack tests: add --advertise-refs tests
  serve.c: move version line to advertise_capabilities()
  serve: move transfer.advertiseSID check into session_id_advertise()
  serve.[ch]: don't pass "struct strvec *keys" to commands
  serve: use designated initializers
  transport: use designated initializers
  transport: rename "fetch" in transport_vtable to "fetch_refs"
  serve: mark has_capability() as static
2021-09-20 15:20:43 -07:00
bbeca063cf Merge branch 'ar/submodule-add-more'
More parts of "git submodule add" has been rewritten in C.

* ar/submodule-add-more:
  submodule--helper: rename compute_submodule_clone_url()
  submodule--helper: remove resolve-relative-url subcommand
  submodule--helper: remove add-config subcommand
  submodule--helper: remove add-clone subcommand
  submodule--helper: convert the bulk of cmd_add() to C
  dir: libify and export helper functions from clone.c
  submodule--helper: remove repeated code in sync_submodule()
  submodule--helper: refactor resolve_relative_url() helper
  submodule--helper: add options for compute_submodule_clone_url()
2021-09-20 15:20:43 -07:00
b5a36278f4 Merge branch 'ar/submodule-add-config'
Large part of "git submodule add" gets rewritten in C.

* ar/submodule-add-config:
  submodule--helper: introduce add-config subcommand
2021-09-20 15:20:42 -07:00
67fc02be54 Merge branch 'ab/unbundle-progress'
Add progress display to "git bundle unbundle".

* ab/unbundle-progress:
  bundle: show progress on "unbundle"
  index-pack: add --progress-title option
  bundle API: change "flags" to be "extra_index_pack_args"
  bundle API: start writing API documentation
2021-09-20 15:20:42 -07:00
a1af533323 Merge branch 'tb/pack-finalize-ordering'
The order in which various files that make up a single (conceptual)
packfile has been reevaluated and straightened up.  This matters in
correctness, as an incomplete set of files must not be shown to a
running Git.

* tb/pack-finalize-ordering:
  pack-objects: rename .idx files into place after .bitmap files
  pack-write: split up finish_tmp_packfile() function
  builtin/index-pack.c: move `.idx` files into place last
  index-pack: refactor renaming in final()
  builtin/repack.c: move `.idx` files into place last
  pack-write.c: rename `.idx` files after `*.rev`
  pack-write: refactor renaming in finish_tmp_packfile()
  bulk-checkin.c: store checksum directly
  pack.h: line-wrap the definition of finish_tmp_packfile()
2021-09-20 15:20:42 -07:00
403192acb6 Merge branch 'cb/pedantic-build-for-developers'
Update the build procedure to use the "-pedantic" build when
DEVELOPER makefile macro is in effect.

* cb/pedantic-build-for-developers:
  developer: enable pedantic by default
  win32: allow building with pedantic mode enabled
  gettext: remove optional non-standard parens in N_() definition
2021-09-20 15:20:41 -07:00
df0c308c1a Merge branch 'ab/progress-users-adjust-counters'
The code to show progress indicator in a few code paths did not
cover between 0-100%, which has been corrected.

* ab/progress-users-adjust-counters:
  entry: show finer-grained counter in "Filtering content" progress line
  commit-graph: fix bogus counter in "Scanning merged commits" progress line
2021-09-20 15:20:41 -07:00
75405e7270 Merge branch 'dt/submodule-diff-fixes'
"git diff --submodule=diff" showed failure from run_command() when
trying to run diff inside a submodule, when the user manually
removes the submodule directory.

* dt/submodule-diff-fixes:
  diff --submodule=diff: don't print failure message twice
  diff --submodule=diff: do not fail on ever-initialied deleted submodules
  t4060: remove unused variable
2021-09-20 15:20:41 -07:00
c2509c5407 Merge branch 'jv/pkt-line-batch'
Reduce number of write(2) system calls while sending the
ref advertisement.

* jv/pkt-line-batch:
  upload-pack: use stdio in send_ref callbacks
  pkt-line: add stdio packet write functions
2021-09-20 15:20:41 -07:00
ed8794ef7a Merge branch 'lh/systemd-timers'
"git maintenance" scheduler learned to use systemd timers as a
possible backend.

* lh/systemd-timers:
  maintenance: add support for systemd timers on Linux
  maintenance: `git maintenance run` learned `--scheduler=<scheduler>`
  cache.h: Introduce a generic "xdg_config_home_for(…)" function
2021-09-20 15:20:40 -07:00
76f5fdc203 Merge branch 'ab/tr2-leaks-and-fixes'
The tracing of process ancestry information has been enhanced.

* ab/tr2-leaks-and-fixes:
  tr2: log N parent process names on Linux
  tr2: do compiler enum check in trace2_collect_process_info()
  tr2: leave the parent list empty upon failure & don't leak memory
  tr2: stop leaking "thread_name" memory
  tr2: clarify TRACE2_PROCESS_INFO_EXIT comment under Linux
  tr2: remove NEEDSWORK comment for "non-procfs" implementations
2021-09-20 15:20:40 -07:00
11e5d0a262 Merge branch 'jt/grep-wo-submodule-odb-as-alternate'
The code to make "git grep" recurse into submodules has been
updated to migrate away from the "add submodule's object store as
an alternate object store" mechanism (which is suboptimal).

* jt/grep-wo-submodule-odb-as-alternate:
  t7814: show lack of alternate ODB-adding
  submodule-config: pass repo upon blob config read
  grep: add repository to OID grep sources
  grep: allocate subrepos on heap
  grep: read submodule entry with explicit repo
  grep: typesafe versions of grep_source_init
  grep: use submodule-ODB-as-alternate lazy-addition
  submodule: lazily add submodule ODBs as alternates
2021-09-20 15:20:39 -07:00
0649303820 Merge branch 'tb/multi-pack-bitmaps'
The reachability bitmap file used to be generated only for a single
pack, but now we've learned to generate bitmaps for history that
span across multiple packfiles.

* tb/multi-pack-bitmaps: (29 commits)
  pack-bitmap: drop bitmap_index argument from try_partial_reuse()
  pack-bitmap: drop repository argument from prepare_midx_bitmap_git()
  p5326: perf tests for MIDX bitmaps
  p5310: extract full and partial bitmap tests
  midx: respect 'GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP'
  t7700: update to work with MIDX bitmap test knob
  t5319: don't write MIDX bitmaps in t5319
  t5310: disable GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP
  t0410: disable GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP
  t5326: test multi-pack bitmap behavior
  t/helper/test-read-midx.c: add --checksum mode
  t5310: move some tests to lib-bitmap.sh
  pack-bitmap: write multi-pack bitmaps
  pack-bitmap: read multi-pack bitmaps
  pack-bitmap.c: avoid redundant calls to try_partial_reuse
  pack-bitmap.c: introduce 'bitmap_is_preferred_refname()'
  pack-bitmap.c: introduce 'nth_bitmap_object_oid()'
  pack-bitmap.c: introduce 'bitmap_num_objects()'
  midx: avoid opening multiple MIDXs when writing
  midx: close linked MIDXs, avoid leaking memory
  ...
2021-09-20 15:20:39 -07:00
deec8aa2d0 Merge branch 'ps/fetch-optim'
Optimize code that handles large number of refs in the "git fetch"
code path.

* ps/fetch-optim:
  fetch: avoid second connectivity check if we already have all objects
  fetch: merge fetching and consuming refs
  fetch: refactor fetch refs to be more extendable
  fetch-pack: optimize loading of refs via commit graph
  connected: refactor iterator to return next object ID directly
  fetch: avoid unpacking headers in object existence check
  fetch: speed up lookup of want refs via commit-graph
2021-09-20 15:20:39 -07:00
6b58df54cf clone: handle unborn branch in bare repos
When cloning a repository with an unborn HEAD, we'll set the local HEAD
to match it only if the local repository is non-bare. This is
inconsistent with all other combinations:

  remote HEAD       | local repo | local HEAD
  -----------------------------------------------
  points to commit  | non-bare   | same as remote
  points to commit  | bare       | same as remote
  unborn            | non-bare   | same as remote
  unborn            | bare       | local default

So I don't think this is some clever or subtle behavior, but just a bug
in 4f37d45706 (clone: respect remote unborn HEAD, 2021-02-05). And it's
easy to see how we ended up there. Before that commit, the code to set
up the HEAD for an empty repo was guarded by "if (!option_bare)". That's
because the only thing it did was call install_branch_config(), and we
don't want to do so for a bare repository (unborn HEAD or not).

That commit put the handling of unborn HEADs into the same block, since
those also need to call install_branch_config(). But the unborn case has
an additional side effect of calling create_symref(), and we want that
to happen whether we are bare or not.

This patch just pulls all of the "figure out the default branch" code
out of the "!option_bare" block. Only the actual config installation is
kept there.

Note that this does mean we might allocate "ref" and not use it (if the
remote is empty but did not advertise an unborn HEAD). But that's not
really a big deal since this isn't a hot code path, and it keeps the
code simple. The alternative would be handling unborn_head_target
separately, but that gets confusing since its memory ownership is
tangled up with the "ref" variable.

There's just one new test, for the case we're fixing. The other ones in
the table are handled elsewhere (the unborn non-bare case just above,
and the actually-born cases in t5601, t5606, and t5609, as they do not
require v2's "unborn" protocol extension).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-20 14:05:36 -07:00
93a8ed28ea Merge branch 'ab/retire-option-argument' into da/difftool
* ab/retire-option-argument:
  parse-options API: remove OPTION_ARGUMENT feature
  difftool: use run_command() API in run_file_diff()
  difftool: prepare "diff" cmdline in cmd_difftool()
  difftool: prepare "struct child_process" in cmd_difftool()
2021-09-20 11:42:34 -07:00
3584cff71c merge-ort: fix completely wrong comment
Not sure what happened, but the comment is describing code elsewhere in
the file.  Fix the comment to actually discuss the code that follows.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-20 11:25:02 -07:00
b031f47802 trace2.h: fix trivial comment typo
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-20 11:25:01 -07:00
04d3761db2 Merge branch 'en/am-abort-fix' into en/removing-untracked-fixes
* en/am-abort-fix:
  am: fix incorrect exit status on am fail to abort
  t4151: add a few am --abort tests
  git-am.txt: clarify --abort behavior
2021-09-20 11:22:09 -07:00
d5fdf3073a builtin/commit-graph.c: don't accept common --[no-]progress
In 84e4484f12 (commit-graph: use parse_options_concat(), 2021-08-23) we
unified common options of commit-graph's subcommands into a single
"common_opts" array.

But 84e4484f12 introduced a behavior change which is to accept the
"--[no-]progress" option before any sub-commands, e.g.,

    git commit-graph --progress write ...

Prior to that commit, the above would error out with "unknown option".

There are two issues with this behavior change. First is that the
top-level --[no-]progress is not always respected. This is because
isatty(2) is performed in the sub-commands, which unconditionally
overwrites any --[no-]progress that was given at the top-level.

But the second issue is that the existing sub-commands of commit-graph
only happen to both have a sensible interpretation of what `--progress`
or `--no-progress` means. If we ever added a sub-command which didn't
have a notion of progress, we would be forced to ignore the top-level
`--[no-]progress` altogether.

Since we haven't released a version of Git that supports --[no-]progress
as a top-level option for `git commit-graph`, let's remove it.

Suggested-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-20 11:01:23 -07:00
d1e894c6d7 Document rebase.forkpoint in rebase man page
The configuration option `rebase.forkpoint' is only mentioned in the man
page of git-config(1). Since it is a configuration for rebase, mention
it in the documentation of rebase at the --fork-point/--no-fork-point
section. This will help users set a preferred default for their
workflow.

Signed-off-by: Wesley Schwengle <wesley@opperschaap.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-20 09:05:48 -07:00
05881a6fc9 t/helper/simple-ipc: convert test-simple-ipc to use start_bg_command
Convert test helper to use `start_bg_command()` when spawning a server
daemon in the background rather than blocks of platform-specific code.

Also, while here, remove _() translation around error messages since
this is a test helper and not Git code.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-20 08:57:58 -07:00
fdb1322651 run-command: create start_bg_command
Create a variation of `run_command()` and `start_command()` to launch a command
into the background and optionally wait for it to become "ready" before returning.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-20 08:57:58 -07:00
8750249053 simple-ipc/ipc-win32: add Windows ACL to named pipe
Set an ACL on the named pipe to allow the well-known group EVERYONE
to read and write to the IPC server's named pipe.  In the event that
the daemon was started with elevation, allow non-elevated clients
to communicate with the daemon.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-20 08:57:58 -07:00
9bd51d4975 simple-ipc/ipc-win32: add trace2 debugging
Create "ipc-debug" category events to log unexpected errors
when creating Simple-IPC connections.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-20 08:57:58 -07:00
59c923229e simple-ipc: move definition of ipc_active_state outside of ifdef
Move the declartion of the `enum ipc_active_state` type outside of
the SUPPORTS_SIMPLE_IPC ifdef.

A later commit will introduce the `fsmonitor_ipc__*()` API and stub in
a "mock" implementation that requires this enum in some function
signatures.

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-20 08:57:58 -07:00
a3e2033e04 simple-ipc: preparations for supporting binary messages.
Add `command_len` argument to the Simple IPC API.

In my original Simple IPC API, I assumed that the request would always
be a null-terminated string of text characters.  The `command`
argument was just a `const char *`.

I found a caller that would like to pass a binary command to the
daemon, so I am amending the Simple IPC API to receive `const char
*command, size_t command_len` arguments.

I considered changing the `command` argument to be a `void *`, but the
IPC layer simply passes it to the pkt-line layer which takes a `const
char *`, so to avoid confusion I left it as is.

Note, the response side has always been a `struct strbuf` which
includes the buffer and length, so we already support returning a
binary answer.  (Yes, it feels a little weird returning a binary
buffer in a `strbuf`, but it works.)

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-20 08:57:58 -07:00
64bc75244b trace2: add trace2_child_ready() to report on background children
Create "child_ready" event to capture the state of a child process
created in the background.

When a child command is started a "child_start" event is generated in
the Trace2 log.  For normal synchronous children, a "child_exit" event
is later generated when the child exits or is terminated.  The two events
include information, such as the "child_id" and "pid", to allow post
analysis to match-up the command line and exit status.

When a child is started in the background (and may outlive the parent
process), it is not possible for the parent to emit a "child_exit"
event.  Create a new "child_ready" event to indicate whether the
child was successfully started.  Also include the "child_id" and "pid"
to allow similar post processing.

This will be used in a later commit with the new "start_bg_command()".

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-20 08:57:58 -07:00
187fc8b8b6 unicode: update the width tables to Unicode 14
Released[0] after a long beta period and including several additional
zero/double width characters.

[0] https://home.unicode.org/announcing-the-unicode-standard-version-14-0/

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-17 17:26:21 -07:00
54156af0d6 t5326: test propagating hashcache values
Now that we both can propagate values from the hashcache, and respect
the configuration to enable the hashcache at all, test that both of
these function correctly by hardening their behavior with a test.

Like the hash-cache in classic single-pack bitmaps, this helps more
proportionally the more up-to-date your bitmap coverage is. When our
bitmap coverage is out-of-date with the ref tips, we spend more time
proportionally traversing, and all of that traversal gets the name-hash
filled in.

But for the up-to-date bitmaps, this helps quite a bit. These numbers
are on git.git, with `pack.threads=1` to help see the difference
reflected in the overall runtime.

    Test                            origin/tb/multi-pack-bitmaps   HEAD
    -------------------------------------------------------------------------------------
    5326.4: simulated clone         1.87(1.80+0.07)                1.46(1.42+0.03) -21.9%
    5326.5: simulated fetch         2.66(2.61+0.04)                1.47(1.43+0.04) -44.7%
    5326.6: pack to file (bitmap)   2.74(2.62+0.12)                1.89(1.82+0.07) -31.0%

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-17 14:34:48 -07:00
bf4a60874a p5326: generate pack bitmaps before writing the MIDX bitmap
To help test the performance of permuting the contents of the hash-cache
when generating a MIDX bitmap, we need a bitmap which has its hash-cache
populated.

And since multi-pack bitmaps don't add *new* values to the hash-cache,
we have to rely on a single-pack bitmap to generate those values for us.

Therefore, generate a pack bitmap before the MIDX one in order to ensure
that the MIDX bitmap has entries in its hash-cache. Since we don't want
to time generating the pack bitmap, move that to a non-perf test run
before we try to generate the MIDX bitmap.

Likewise, get rid of the pack bitmap afterwords, to make certain that we
are not accidentally using it in the performance tests run later on.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-17 14:34:48 -07:00
4b81f690f6 Documentation: cleanup git-cvsserver
Fix a few typos and alignment issues, and while at it update the
example hashes to show most of the ones available in recent crypt(3).

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-16 20:47:48 -07:00
bffcb4d9d6 git-cvsserver: protect against NULL in crypt(3)
Some versions of crypt(3) will return NULL when passed an unsupported
hash type (ex: OpenBSD with DES), so check for undef instead of using
it directly.

Also use this to probe the system and select a better hash function in
the tests, so it can pass successfully.

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
[jc: <CAPUEspjqD5zy8TLuFA96usU7FYi=0wF84y7NgOVFqegtxL9zbw@mail.gmail.com>]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-16 20:47:23 -07:00
a7775c7eb8 git-cvsserver: use crypt correctly to compare password hashes
c057bad370 (git-cvsserver: use a password file cvsserver pserver,
2010-05-15) adds a way for `git cvsserver` to provide authenticated
pserver accounts without having clear text passwords, but uses the
username instead of the password to the call for crypt(3).

Correct that, and make sure the documentation correctly indicates how
to obtain hashed passwords that could be used to populate this
configuration, as well as correcting the hash that was used for the
tests.

This change will require that any user of this feature updates the
hashes in their configuration, but has the advantage of using a more
similar format than cvs uses, probably also easying any migration.

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-16 15:06:24 -07:00
66c0c44df6 t0000: avoid masking git exit value through pipes
9af0b8dbe2 (t0000-basic: more commit-tree tests., 2006-04-26) adds
tests for commit-tree that mask the return exit from git as described
in a378fee5b0 (Documentation: add shell guidelines, 2018-10-05).

Fix the tests, to avoid pipes by using a temporary file instead.

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-16 13:43:42 -07:00
637799bf0a tree-diff: fix leak when not HAVE_ALLOCA_H
b8ba412bf7 (tree-diff: avoid alloca for large allocations, 2016-06-07)
adds a way to route some bigger allocations out of the stack and free
them through the addition of two conveniently named macros, but leaves
the calls to free the xalloca part, which could be also in the heap,
if the system doesn't HAVE_ALLOCA_H (ex: macOS and other BSD).

Add the missing free call, xalloca_free(), which is a noop if we
allocated memory in the stack frame, but a real free() if we
allocated in the heap instead, and while at it, change the expression
to match in both macros for ease of readability.

This avoids a leak reported by LSAN while running t0000 but that
wouldn't fail the test (which is fixed in the next patch):

  SUMMARY: LeakSanitizer: 1034 byte(s) leaked in 15 allocation(s).

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-16 13:43:42 -07:00
afb32e8101 pack-revindex.h: correct the time complexity descriptions
Time complexities for pack_pos_to_midx and midx_to_pack_pos are swapped,
correct it.

Signed-off-by: Kyle Zhao <kylezhao@tencent.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-15 22:16:25 -07:00
4c719308ce The sixth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-15 13:15:28 -07:00
44257f7b52 Merge branch 'jc/prefix-filename-allocates'
Leakfix.

* jc/prefix-filename-allocates:
  hash-object: prefix_filename() returns allocated memory these days
2021-09-15 13:15:28 -07:00
3d141d8789 Merge branch 'rs/range-diff-avoid-segfault-with-I'
"git range-diff -I... <range> <range>" segfaulted, which has been
corrected.

* rs/range-diff-avoid-segfault-with-I:
  range-diff: avoid segfault with -I
2021-09-15 13:15:27 -07:00
1ea5e46cb9 Merge branch 'ab/reverse-midx-optim'
The code that optionally creates the *.rev reverse index file has
been optimized to avoid needless computation when it is not writing
the file out.

* ab/reverse-midx-optim:
  pack-write: skip *.rev work when not writing *.rev
2021-09-15 13:15:27 -07:00
5639a8d144 Merge branch 'bs/install-strip'
"make INSTALL_STRIP=-s install" allows the installation step to use
"install -s" to strip the binaries as they get installed.

* bs/install-strip:
  make: add INSTALL_STRIP option variable
2021-09-15 13:15:26 -07:00
2b2af95908 Merge branch 'pb/test-use-user-env'
Teach "test_pause" and "debug" helpers to allow using the HOME and
TERM environment variables the user usually uses.

* pb/test-use-user-env:
  test-lib-functions: keep user's debugger config files and TERM in 'debug'
  test-lib-functions: optionally keep HOME, TERM and SHELL in 'test_pause'
  test-lib-functions: use 'TEST_SHELL_PATH' in 'test_pause'
2021-09-15 13:15:26 -07:00
c76fcf3e46 Merge branch 'jc/trivial-threeway-binary-merge'
The "git apply -3" code path learned not to bother the lower level
merge machinery when the three-way merge can be trivially resolved
without the content level merge.

* jc/trivial-threeway-binary-merge:
  apply: resolve trivial merge without hitting ll-merge with "--3way"
2021-09-15 13:15:26 -07:00
f696272e58 Merge branch 'bs/doc-bugreport-outdir'
Docfix.

* bs/doc-bugreport-outdir:
  Documentation: fix default directory of git bugreport -o
2021-09-15 13:15:25 -07:00
59a29d1644 Merge branch 'ab/no-more-check-bindir'
Build simplification.

* ab/no-more-check-bindir:
  Makefile: remove the check_bindir script
2021-09-15 13:15:25 -07:00
10de757a09 Merge branch 'ab/send-email-config-fix'
Regression fix.

* ab/send-email-config-fix:
  send-email: fix a "first config key wins" regression in v2.33.0
2021-09-15 13:15:24 -07:00
e8332242b7 Merge branch 'so/diff-index-regression-fix'
Recent "diff -m" changes broke "gitk", which has been corrected.

* so/diff-index-regression-fix:
  diff-index: restore -c/--cc options handling
2021-09-15 13:15:24 -07:00
7c1200745b t1400: avoid SIGPIPE race condition on fifo
t1400.190 sometimes fails or even hangs because of the way it uses
fifos. Our goal is to interactively read and write lines from
update-ref, so we have two fifos, in and out. We open a descriptor
connected to "in" and redirect output to that, so that update-ref does
not see EOF as it would if we opened and closed it for each "echo" call.

But we don't do the same for the output. This leads to a race where our
"read response <out" has not yet opened the fifo, but update-ref tries
to write to it and gets SIGPIPE. This can result in the test failing, or
worse, hanging as we wait forever for somebody to write to the pipe.

This is the same proble we fixed in 4783e7ea83 (t0008: avoid SIGPIPE
race condition on fifo, 2013-07-12), and we can fix it the same way, by
opening a second long-running descriptor.

Before this patch, running:

  ./t1400-update-ref.sh --run=1,190 --stress

failed or hung within a few dozen iterations. After it, I ran it for
several hundred without problems.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-15 13:06:50 -07:00
ce125d431a submodule: extract path to submodule gitdir func
We currently store each submodule gitdir in ".git/modules/<name>", but
this has problems with some submodule naming schemes, as described in a
comment in submodule_name_to_gitdir() in this patch.

Extract the determination of the location of a submodule's gitdir into
its own function submodule_name_to_gitdir(). For now, the problem
remains unsolved, but this puts us in a better position for finding a
solution.

This was motivated, at $DAYJOB, by a part of Android's repo hierarchy
[1]. In particular, there is a repo "build", and several repos of the
form "build/<name>".

This is based on earlier work by Brandon Williams [2].

[1] https://android.googlesource.com/platform/
[2] https://lore.kernel.org/git/20180808223323.79989-2-bmwill@google.com/

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-15 12:59:12 -07:00
ccf094788c ls-refs: reject unknown arguments
The v2 ls-refs command may receive extra arguments from the client, one
per pkt-line. The spec is pretty clear that the arguments must come from
a specified set, but we silently ignore any unknown entries. For a
well-behaved client this doesn't matter, but it makes testing and
debugging more confusing. Let's tighten this up to match the spec.

In theory this liberal behavior _could_ be useful for extending the
protocol. But:

  - every other part of the protocol requires that the server first
    indicate that it supports the argument; this includes the fetch and
    object-info commands, plus the "unborn" capability added to ls-refs
    itself

  - it's not a very good extension mechanism anyway; without the server
    advertising support, clients would have no idea if the argument was
    silently ignored, or accepted and simply had no effect

So we're not really losing anything by tightening this.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-15 12:25:19 -07:00
0ab7eeccd9 serve: reject commands used as capabilities
Our table of v2 "capabilities" contains everything we might tell the
client we support. But there are differences in how we expect the client
to respond. Some of the entries are true capabilities (i.e., we expect
the client to say "yes, I support this"), and some are ones we expect
them to send as commands (with "command=ls-refs" or similar).

When we receive a capability used as a command, we complain about that.
But when we receive a command used as a capability (e.g., just "ls-refs"
in a pkt-line by itself), we silently ignore it.

This isn't really hurting anything (clients shouldn't send it, and we'll
ignore it), but we can tighten up the protocol to match what we expect
to happen.

There are two new tests here. The first one checks a capability used as
a command, which already passes. The second tests a command as a
capability, which this patch fixes.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-15 12:25:19 -07:00
108c265f27 serve: reject bogus v2 "command=ls-refs=foo"
When we see a line from the client like "command=ls-refs", we parse
everything after the equals sign as a capability, which we check against
our capabilities table. If we don't recognize the command (e.g.,
"command=foo"), we'll reject it.

But in parse_command(), we use the same get_capability() parser for
parsing non-command lines. So if we see "command=ls-refs=foo", we will
feed "ls-refs=foo" to get_capability(), which will say "OK, that's
ls-refs, with value 'foo'". But then we simply ignore the value
entirely.

The client is violating the spec here, which says:

      command = PKT-LINE("command=" key LF)
      key = 1*(ALPHA | DIGIT | "-_")

I.e., the key is not even allowed to have an equals sign in it. Whereas
a real non-command capability does allow a value:

      capability = PKT-LINE(key[=value] LF)

So by reusing the same get_capability() parser, we are mixing up the
"key" and "capability" tokens. However, since that parser tells us
whether it saw an "=", we can still use it; we just need to reject any
input that produces a non-NULL value field.

The current behavior isn't really hurting anything (the client should
never send such a request, and if it does, we just ignore the "value"
part). But since it does violate the spec, let's tighten it up to
prevent any surprising behavior.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-15 12:25:19 -07:00
9db5fb4fb3 docs/protocol-v2: clarify some ls-refs ref-prefix details
We've never documented the fact that a client can provide multiple
ref-prefix capabilities. Let's describe the behavior.

We also never discussed the "best effort" nature of the prefixes. The
client side of git.git has always treated them this way, filtering the
result with local patterns. And indeed any client must do this, because
the prefix patterns are not sufficient to express the usual refspecs
(and so for "foo" we ask for "refs/heads/foo", "refs/tags/foo", and so
on).

So this may be considered a change in the spec with respect to client
expectations / requirements, but it's mostly codifying existing
behavior.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-15 12:25:19 -07:00
7f0e4f6ac2 ls-refs: ignore very long ref-prefix counts
Because each "ref-prefix" capability from the client comes in its own
pkt-line, there's no limit to the number of them that a misbehaving
client may send. We read them all into a strvec, which means the client
can waste arbitrary amounts of our memory by just sending us "ref-prefix
foo" over and over.

One possible solution is to just drop the connection when the limit is
reached. If we set it high enough, then only misbehaving or malicious
clients would hit it. But "high enough" is vague, and it's unfriendly if
we guess wrong and a legitimate client hits this.

But we can do better. Since supporting the ref-prefix capability is
optional anyway, the client has to further cull the response based on
their own patterns. So we can simply ignore the patterns once we cross a
certain threshold. Note that we have to ignore _all_ patterns, not just
the ones past our limit (since otherwise we'd send too little data).

The limit here is fairly arbitrary, and probably much higher than anyone
would need in practice. It might be worth limiting it further, if only
because we check it linearly (so with "m" local refs and "n" patterns,
we do "m * n" string comparisons). But if we care about optimizing this,
an even better solution may be a more advanced data structure anyway.

I didn't bother making the limit configurable, since it's so high and
since Git should behave correctly in either case. It wouldn't be too
hard to do, but it makes both the code and documentation more complex.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-15 12:25:19 -07:00
f0a35c9ce5 serve: drop "keys" strvec
We collect the set of capabilities the client sends us in a strvec.
While this is usually small, there's no limit to the number of
capabilities the client can send us (e.g., they could just send us
"agent" pkt-lines over and over, and we'd keep adding them to the list).

Since all code has been converted away from using this list, let's get
rid of it. This avoids a potential attack where clients waste our
memory.

Note that we do have to replace it with a flag, because some of the
flush-packet logic checks whether we've seen any valid commands or keys.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-15 12:25:19 -07:00
ab539c9094 serve: provide "receive" function for session-id capability
Rather than pulling the session-id string from the list of collected
capabilities, we can handle it as soon as we receive it. This gets us
closer to dropping the collected list entirely.

The behavior should be the same, with one exception. Previously if the
client sent us multiple session-id lines, we'd report only the first.
Now we'll pass each one along to trace2. This shouldn't matter in
practice, since clients shouldn't do that (and if they do, it's probably
sensible to log them all).

As this removes the last caller of the static has_capability(), we can
remove it, as well (and in fact we must to avoid -Wunused-function
complaining).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-14 17:12:05 -07:00
c7d3aabd27 serve: provide "receive" function for object-format capability
We get any "object-format" specified by the client by searching for it
in the collected list of capabilities the client sent. We can instead
just handle it as soon as they send it. This is slightly more efficient,
and gets us one step closer to dropping that collected list.

Note that we do still have to do our final hash check after receiving
all capabilities (because they might not have sent an object-format line
at all, and we still have to check that the default matches our
repository algorithm). Since the check_algorithm() function would now be
down to a single if() statement, I've just inlined it in its only
caller.

There should be no change of behavior here, except for two
broken-protocol cases:

  - if the client sends multiple conflicting object-format capabilities
    (which they should not), we'll now choose the last one rather than
    the first. We could also detect and complain about the duplicates
    quite easily now, which we could not before, but I didn't do so
    here.

  - if the client sends a bogus "object-format" with no equals sign,
    we'll now say so, rather than "unknown object format: ''"

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-14 17:12:05 -07:00
97b89c8150 p5326: don't set core.multiPackIndex unnecessarily
When this performance test was originally written, `core.multiPackIndex`
was not the default and thus had to be enabled. But now that we have
18e449f86b (midx: enable core.multiPackIndex by default, 2020-09-25), we
no longer need this.

Drop the unnecessary setup (even though it's not hurting anything, it is
unnecessary at best and confusing at worst).

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-14 16:34:18 -07:00
2082224f17 p5326: create missing 'perf-tag' tag
Some of the tests in test_full_bitmap rely on having a tag named
perf-tag in place. We could create it in test_full_bitmap(), but we want
to have it in place before the repack starts.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-14 16:34:18 -07:00
caca3c9f07 midx.c: respect 'pack.writeBitmapHashcache' when writing bitmaps
In the previous commit, the bitmap writing code learned to propagate
values from an existing hash-cache extension into the bitmap that it is
writing.

Now that that functionality exists, let's expose it by teaching the 'git
multi-pack-index' builtin to respect the `pack.writeBitmapHashCache`
option so that the hash-cache may be written at all.

Two minor points worth noting here:

  - The 'git multi-pack-index write' sub-command didn't previously read
    any configuration (instead this is handled in the base command). A
    separate handler is added here to respect this write-specific
    config option.

  - I briefly considered adding a 'bitmap_flags' field to the static
    options struct, but decided against it since it would require
    plumbing through a new parameter to the write_midx_file() function.

    Instead, a new MIDX-specific flag is added, which is translated to
    the corresponding bitmap one.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-14 16:34:18 -07:00
8de300e1f7 pack-bitmap.c: propagate namehash values from existing bitmaps
When an old bitmap exists while writing a new one, we load it and build
a "reposition" table which maps bit positions of objects from the old
bitmap to their respective positions in the new bitmap. This can help
when we encounter a commit which was selected in both the old and new
bitmap, since we only need to permute its bit (not recompute it from
scratch).

We do not, however, repurpose existing namehash values in the case of
the hash-cache extension. There has been thus far no good reason to do
so, since all of the namehash values for objects in the new bitmap would
be populated during the traversal that was just performed by
pack-objects when generating single-pack reachability bitmaps.

But this isn't the case for multi-pack bitmaps, which are written via
`git multi-pack-index write --bitmap` and do not perform any traversal.
In this case all namehash values are set to zero, but we don't even
bother to check the `pack.writeBitmapHashcache` option anyway, so it
fails to matter.

There are two approaches we could take to fill in non-zero hash-cache
values:

  - have either the multi-pack-index builtin run its own
    traversal to attempt to fill in some values, or let a hypothetical
    caller (like `pack-objects` when `repack` eventually drives the
    `multi-pack-index` builtin) fill in the values they found during
    their traversal

  - or copy any existing namehash values that were stored in an
    existing bitmap to their corresponding positions in the new bitmap

In a system where a repository is generally repacked with `git repack
--geometric=<d>` and occasionally repacked with `git repack -a`, the
hash-cache coverage will tend towards all objects.

Since populating the hash-cache is additive (i.e., doing so only helps
our delta search), any intermediate lack of full coverage is just fine.
So let's start by just propagating any values from the existing
hash-cache if we see one.

The next patch will respect the `pack.writeBitmapHashcache` option while
writing MIDX bitmaps, and then test this new behavior.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-14 16:34:17 -07:00
a05f02b1d9 t/helper/test-bitmap.c: add 'dump-hashes' mode
The pack-bitmap writer code is about to learn how to propagate values
from an existing hash-cache. To prepare, teach the test-bitmap helper to
dump the values from a bitmap's hash-cache extension in order to test
those changes.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-14 16:34:17 -07:00
e56e53067f serve: add "receive" method for v2 capabilities table
We have a capabilities table that tells us what we should tell the
client we are capable of, and what to do when a client gives us a
particular command (e.g., "command=ls-refs"). But it doesn't tell us
what to do when the client sends us back a capability (e.g.,
"object-format=sha256"). We just collect them all in a strvec and hope
somebody can use them later.

Instead, let's provide a function pointer in the table to act on these.
This will eventually help us avoid collecting the strings, which will be
more efficient and less prone to mischief.

Using the new method is optional, which helps in two ways:

  - we can move existing capabilities over to this new system gradually
    in individual commits

  - some capabilities we don't actually do anything with anyway. For
    example, the client is free to say "agent=git/1.2.3" to us, but we
    do not act on the information in any way.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-14 10:56:19 -07:00
5ef260d2d1 serve: return capability "value" from get_capability()
When the client sends v2 capabilities, they may be simple, like:

  foo

or have a value like:

  foo=bar

(all of the current capabilities actually expect a value, but the
protocol allows for boolean ones).

We use get_capability() to make sure the client's pktline matches a
capability. In doing so, we parse enough to see the "=" and the value
(if any), but we immediately forget it. Nobody cares for now, because they end
up parsing the values out later using has_capability(). But in
preparation for changing that, let's pass back a pointer so the callers
know what we found.

Note that unlike has_capability(), we'll return NULL for a "simple"
capability. Distinguishing these will be useful for some future patches.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-14 10:56:19 -07:00
76804526f9 serve: rename is_command() to parse_command()
The is_command() function not only tells us whether the pktline is a
valid command string, but it also parses out the command (and complains
if we see a duplicate). Let's rename it to make those extra functions
a bit more obvious.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-14 10:56:19 -07:00
0057847208 Merge branch 'ab/serve-cleanup' into jk/reduce-malloc-in-v2-servers
* ab/serve-cleanup:
  upload-pack: document and rename --advertise-refs
  serve.[ch]: remove "serve_options", split up --advertise-refs code
  {upload,receive}-pack tests: add --advertise-refs tests
  serve.c: move version line to advertise_capabilities()
  serve: move transfer.advertiseSID check into session_id_advertise()
  serve.[ch]: don't pass "struct strvec *keys" to commands
  serve: use designated initializers
  transport: use designated initializers
  transport: rename "fetch" in transport_vtable to "fetch_refs"
  serve: mark has_capability() as static
2021-09-14 10:56:05 -07:00
b6d8887d3d documentation: add documentation for 'git version'
While 'git version' is probably the least complex git command,
it is a non-experimental user-facing builtin command. As such
it should have a help page.

Both `git help` and `git version` can be called as options
(`--help`/`--version`) that internally get converted to the
corresponding command. Add a small paragraph to
Documentation/git.txt describing how these two options
interact with each other and link to this help page for the
sub-options that `--version` can take. Well, currently there
is only one sub-option, but that could potentially increase
in future versions of Git.

Signed-off-by: Matthias Aßhauer <mha1993@live.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-14 10:05:40 -07:00
a3952f8e7c help: make sure local html page exists before calling external processes
We check that git.html exists, regardless of the page the user wants to open.
Checking whether the requested page exists instead gives us a smoother user
experience in two use cases:

1) The requested page doesn't exist

When calling a git command and there is an error, most users reasonably expect
git to produce an error message on the standard error stream, but in this case
we pass the filepath to git web--browse which passes it on to a browser (or a
helper program like xdg-open or start that should in turn open a browser)
without any error and many GUI based browsers or helpers won't output such a
message onto the standard error stream.

Especially the helper programs tend to show the corresponding error message in
a message box and wait for user input before exiting. This leaves users in
interactive console sessions without an error message in their console,
without a console prompt and without the help page they expected.

2) git.html is missing for some reason, but the user asked for some other page

We currently refuse to show any local html help page when we can't find
git.html. Even if the requested help page exists. If we check for the requested
page instead, we can show the user all available pages and only error out on
those that don't exist.

Signed-off-by: Matthias Aßhauer <mha1993@live.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-14 10:04:08 -07:00
bb390b1f49 git-compat-util: include declaration for unix sockets in windows
Available since Windows 10 release 1803 and Windows Server 2019.

NO_UNIX_SOCKETS is still the default for Windows builds, as they need
to keep backward compatibility with releases up to Windows 7, but allow
including the header otherwise.

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-14 09:30:54 -07:00
245670cd46 credential-cache: check for windows specific errors
Connect and reset errors aren't what will be expected by POSIX but
are instead compatible with the ones used by WinSock.

To avoid any possibility of confusion with other systems, checks
for disconnection and availability had been abstracted into helper
functions that are platform specific.

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-14 09:30:54 -07:00
0fdcfa2f9f t0301: fixes for windows compatibility
In preparation for a future patch that will allow building with
Unix Sockets in Windows, workaround a couple of issues from the
Mingw-W64 compatibility layer.

test -S is not able to detect that a file is a socket, so use
test -e instead (through a library function).

`mkdir -m` can't represent a valid ACL directly and fails with
permission problems, so instead call mkdir followed by chmod, which
has been enhanced to do so.

The last invocation of mkdir would likely need the same treatment
but SYMLINK is unlikely to be enabled on Windows so it has been
punted for now.

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-14 09:30:53 -07:00
ae578de926 doc: config, tell readers of git help --config
The `git help` command gained the ability to list config variables in
3ac68a93fd (help: add --config to list all available config, 2018-05-26)
but failed to tell readers of the config documenation itself.

Provide that cross reference.

Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-13 14:51:07 -07:00
911aba1420 bisect--helper: retire --bisect-next-check subcommand
After reimplementation of `git bisect run` in C,
`--bisect-next-check` subcommand is not needed anymore.

Let's remove it from options list and code.

Mentored by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-13 13:37:39 -07:00
d1bbbe45df bisect--helper: reimplement bisect_run shell function in C
Reimplement the `bisect_run()` shell function
in C and also add `--bisect-run` subcommand to
`git bisect--helper` to call it from git-bisect.sh.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-13 13:37:37 -07:00
5e1f28d206 bisect--helper: reimplement bisect_visualize() shell function in C
Reimplement the `bisect_visualize()` shell function
in C and also add `--bisect-visualize` subcommand to
`git bisect--helper` to call it from git-bisect.sh.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-13 13:37:37 -07:00
3f36e6f30c run-command: make exists_in_PATH() non-static
Remove the `static` keyword from `exists_in_PATH()` function
and declare the function in `run-command.h` file.
The function will be used in bisect_visualize() in a later
commit.

Mentored by: Christian Couder <chriscool@tuxfamily.org>
Mentored by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-13 13:37:37 -07:00
5fe973b912 t6030-bisect-porcelain: add test for bisect visualize
Add a test to control breakages in bisect visualize command.

Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-13 13:37:37 -07:00
282073cce2 t6030-bisect-porcelain: add tests to control bisect run exit cases
There is a gap on bisect run test coverage related with error exits.
Add two tests to control these error cases.

Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-13 13:37:37 -07:00
d045719ac8 t3407: rework rebase --quit tests
9512177b68 ("rebase: add --quit to cleanup rebase, leave everything
else untouched", 2016-11-12) seems to have copied the --abort tests
but added two separate tests for the two rebase backends rather than
adding a single test into the existing testrebase() function.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-13 12:47:07 -07:00
1e14bc11ed t3407: strengthen rebase --abort tests
The existing tests only check that HEAD points to the correct
commit after aborting, they do not check that the original branch
is checked out.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-13 12:47:06 -07:00
0b7ae738a6 t3407: use test_path_is_missing
At the end of the test we expect the state directory to be missing,
but the tests only check it is not a directory.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-13 12:47:06 -07:00
54627db03a t3407: rename a variable
$dotest holds the name of the rebase state directory and was named
after the state directory used at the time the test was added. As we
no longer use that name rename the variable to reflect its purpose.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-13 12:47:06 -07:00
7390c65df4 t3407: use test_cmp_rev
This provides better diagnostics if a test fails

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-13 12:47:06 -07:00
e505f452d4 t3407: use test_commit
Simplify the setup by using test_commit. Note that this replaces the
branch "pre-rebase" with a tag of the same name. "pre-rebase" is only
used as a fixed reference point so it makes sense to use a tag rather
than a branch.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-13 12:47:06 -07:00
f20c1fb296 t3407: run tests in $TEST_DIRECTORY
Commit 97b88dd58c ("git-rebase.sh: Fix --merge --abort failures when
path contains whitespace", 2008-05-04) started running these tests in
a subdirectory with a space in its name. At that time $TEST_DIRECTORY
did not contain a space but shortly after in 4a7aaccd83 ("Rename the
test trash directory to contain spaces.", 2008-05-04) $TEST_DIRECTORY
was changed to contain a space so we no longer need to run these tests
in a subdirectory.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-13 12:47:06 -07:00
32da6e6daf http: don't hardcode the value of CURL_SOCKOPT_OK
Use the new git-curl-compat.h header to define CURL_SOCKOPT_OK to its
known value if we're on an older curl version that doesn't have it. It
was hardcoded in http.c in a15d069a19 (http: enable keepalive on TCP
sockets, 2013-10-12).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-13 10:39:04 -07:00
e4ff3b67c2 http: centralize the accounting of libcurl dependencies
As discussed in 644de29e22 (http: drop support for curl < 7.19.4,
2021-07-30) checking against LIBCURL_VERSION_NUM isn't as reliable as
checking specific symbols present in curl, as some distros have been
known to backport features.

However, while some of the curl_easy_setopt() arguments we rely on are
macros, others are enum, and we can't assume that those that are
macros won't change into enums in the future.

So we're still going to have to check LIBCURL_VERSION_NUM, but by
doing that in one central place and using a macro definition of our
own, anyone who's backporting features can define it themselves, and
thus have access to more modern curl features that they backported,
even if they didn't bump the LIBCURL_VERSION_NUM.

More importantly, as shown in a preceding commit doing these version
checks makes for hard to read and possibly buggy code, as shown by the
bug fixed there where we were conflating base 10 for base 16 when
comparing the version.

By doing them all in one place we'll hopefully reduce the chances of
such future mistakes, furthermore it now becomes easier to see at a
glance what the oldest supported version is, which makes it easier to
reason about any future deprecation similar to the recent
e48a623dea (Merge branch 'ab/http-drop-old-curl', 2021-08-24).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-13 10:39:04 -07:00
905a028804 http: correct curl version check for CURLOPT_PINNEDPUBLICKEY
In aeff8a6121 (http: implement public key pinning, 2016-02-15) a
dependency and warning() was added if curl older than 7.44.0 was used,
but the relevant code depended on CURLOPT_PINNEDPUBLICKEY, introduced
in 7.39.0.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-13 10:39:04 -07:00
2a7f64616a http: correct version check for CURL_HTTP_VERSION_2
In d73019feb4 (http: add support selecting http version, 2018-11-08)
a dependency was added on CURL_HTTP_VERSION_2, but this feature was
introduced in curl version 7.43.0, not 7.47.0, as the incorrect
version check led us to believe.

As looking through the history of that commit on the mailing list will
reveal[1], the reason for this is that an earlier version of it
depended on CURL_HTTP_VERSION_2TLS, which was introduced in libcurl
7.47.0.

But the version that made it in in d73019feb4 had dropped the
dependency on CURL_HTTP_VERSION_2TLS, but the corresponding version
check was not corrected.

The newest symbol we depend on is CURL_HTTP_VERSION_2. It was added in
7.33.0, but the CURL_HTTP_VERSION_2 alias we used was added in
7.47.0. So we could support an even older version here, but let's just
correct the checked version.

1. https://lore.kernel.org/git/pull.69.git.gitgitgadget@gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-13 10:39:04 -07:00
7ce3dcd533 http: drop support for curl < 7.18.0 (again)
In 644de29e22 (http: drop support for curl < 7.19.4, 2021-07-30) we
dropped support for curl < 7.19.4, so we can drop support for this
non-obvious dependency on curl < 7.18.0.

It's non-obvious because in curl's hex version notation 0x071800 is
version 7.24.0, *not* 7.18.0, so at a glance this patch looks
incorrect.

But it's correct, because the existing version check being removed
here is wrong. The check guards use of the following curl defines:

    CURLPROXY_SOCKS4                7.10
    CURLPROXY_SOCKS4A               7.18.0
    CURLPROXY_SOCKS5                7.10
    CURLPROXY_SOCKS5_HOSTNAME       7.18.0

I.e. the oldest version that has these is in fact 7.18.0, not
7.24.0. That we were checking 7.24.0 is just an mistake in
6d7afe07f2 (remote-http(s): support SOCKS proxies, 2015-10-26),
i.e. its author confusing base 10 and base 16.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-13 10:39:04 -07:00
2d4032c2fb Makefile: drop support for curl < 7.9.8 (again)
In 1119a15b5c (http: drop support for curl < 7.11.1, 2021-07-30)
support for curl versions older than 7.11.1 was removed, and we
currently require at least version 7.19.4, see 644de29e22 (http: drop
support for curl < 7.19.4, 2021-07-30).

In those changes this Makefile-specific check added in
0890098780 (Decide whether to build http-push in the Makefile,
2005-11-18) was missed, now that we're never going to use such an
ancient curl version we don't need to check that we have at least
7.9.8 here. I have no idea what in http-push.c broke on versions older
than that.

This does not impact "NO_CURL" setups, as this is in the "else" branch
after that check.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-13 10:39:04 -07:00
59a399ed36 INSTALL: mention that we need libcurl 7.19.4 or newer to build
Without NO_CURL=Y we require at least version "7.19.4" of libcurl, see
644de29e22 (http: drop support for curl < 7.19.4, 2021-07-30). Let's
document this in the "INSTALL" document.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-13 10:39:03 -07:00
4c25356e0e parse-options API: remove OPTION_ARGUMENT feature
As was noted in 1a85b49b87 (parse-options: make OPT_ARGUMENT() more
useful, 2019-03-14) there's only ever been one user of the
OPT_ARGUMENT(), that user was added in 20de316e33 (difftool: allow
running outside Git worktrees with --no-index, 2019-03-14).

The OPT_ARGUMENT() feature itself was added way back in
580d5bffde (parse-options: new option type to treat an option-like
parameter as an argument., 2008-03-02), but as discussed in
1a85b49b87 wasn't used until 20de316e33 in 2019.

Now that the preceding commit has migrated this code over to using
"struct strvec" to manage the "args" member of a "struct
child_process", we can just use that directly instead of relying on
OPT_ARGUMENT.

This has a minor change in behavior in that if we'll pass --no-index
we'll now always pass it as the first argument, before we'd pass it in
whatever position the caller did. Preserving this was the real value
of OPT_ARGUMENT(), but as it turns out we didn't need that either. We
can always inject it as the first argument, the other end will parse
it just the same.

Note that we cannot remove the "out" and "cpidx" members of "struct
parse_opt_ctx_t" added in 580d5bffde, while they were introduced with
OPT_ARGUMENT() we since used them for other things.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-12 23:27:38 -07:00
cc5b594788 difftool: use run_command() API in run_file_diff()
Change the run_file_diff() function to use the run_command() API
directly, instead of invoking the run_command_v_opt_cd_env() wrapper.

This allows it, like run_dir_diff(), to use the "args" from "struct
strvec", instead of the "const char **argv" passed into
cmd_difftool(). This will be used in the subsequent commit to get rid
of OPT_ARGUMENT() from cmd_difftool().

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-12 23:27:38 -07:00
b4c7aab7b9 difftool: prepare "diff" cmdline in cmd_difftool()
We call into either run_dir_diff() or run_file_diff(), each of which
sets up a child argv starting with "diff" and some hard-coded options
(depending on which mode we're using). Let's extract that logic into the
caller, which will make it easier to modify the options for cases which
affect both functions.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-12 23:27:38 -07:00
ec3cc27ab0 difftool: prepare "struct child_process" in cmd_difftool()
Move the preparation of the "struct child_process" from run_dir_diff()
to its only caller, cmd_difftool(). This is in preparation for
migrating run_file_diff() to using the run_command() API directly, and
to move more of the shared setup of the two to cmd_difftool().

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-12 23:27:38 -07:00
84122ecdbf git rev-parse --parseopt tests: add more usagestr tests
Add tests for the "usagestr" passed to parse-options.c
usage_with_options_internal() through cmd_parseopt().

These test for edge cases in the existing behavior related to the
"--parseopt" interface doing its own line-splitting with
strbuf_getline(), and the native C interface expecting and potentially
needing to handle newlines within the strings in the array it
accepts. The results are probably something that wasn't anticipated,
but let's make sure we stay backwards compatible with it.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-12 18:57:30 -07:00
78a509190d send-pack: properly use parse_options() API for usage string
When "send-pack" was changed to use the parse_options() API in
068c77a518 (builtin/send-pack.c: use parse_options API, 2015-08-19)
it was made to use one very long line, instead it should split them up
with newlines.

Furthermore we were including an inline explanation that you couldn't
combine "--all" and "<ref>", but unlike in the "blame" case this was
not preceded by an empty string.

Let's instead show that --all and <ref> can't be combined in the the
usual language of the usage syntax instead. We can make it clear that
one of the two options "--foo" and "--bar" is mandatory, but that the
two are mutually exclusive by referring to them as "( --foo | --bar
)".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-12 18:57:30 -07:00
5d70198efe parse-options API users: align usage output in C-strings
In preparation for having continued usage lines properly aligned in
"git <cmd> -h" output, let's have the "[" on the second such lines
align with the "[" on the first line.

In some cases this makes the output worse, because e.g. the "git
ls-remote -h" output had been aligned to account for the extra
whitespace that the usage_with_options_internal() function in
parse-options.c would add.

In other cases such as builtin/stash.c (not changed here), we were
aligned in the C strings, but since that didn't account for the extra
padding in usage_with_options_internal() it would come out looking
misaligned, e.g. code like this:

	N_("git stash [push [-p|--patch] [-k|--[no-]keep-index] [-q|--quiet]\n"
	   "          [-u|--include-untracked] [-a|--all] [-m|--message <message>]\n"

Would emit:

   or: git stash [push [-p|--patch] [-k|--[no-]keep-index] [-q|--quiet]
          [-u|--include-untracked] [-a|--all] [-m|--message <message>]

Let's change all the usage arrays which use such continued usage
output via "\n"-embedding to be like builtin/stash.c.

This makes the output worse temporarily, but in a subsequent change
I'll improve the usage_with_options_internal() to take this into
account, at which point all of the strings being changed here will
emit prettier output.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-12 18:57:30 -07:00
3218cb753f gc: remove unused launchctl_get_uid() call
When the launchctl_boot_plist() function was added in
a16eb6b1ff (maintenance: skip bootout/bootstrap when plist is
registered, 2021-08-24), an unused call to launchctl_get_uid() was
added along with it. That call appears to have been copy/pasted from
launchctl_boot_plist().

Since we can remove that, we can also get rid of the "result"
variable, whose only purpose was allow for the free() between its
assignment and the return. That pattern also appears to have been
copy/pasted from launchctl_boot_plist().

As the patch shows the returned value from launchctl_get_uid() wasn't
used at all in this function. The launchctl_get_uid() function itself
just calls xstrfmt() and getuid(), neither of which have any subtle
global side-effects, so this removal is safe.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-12 16:47:18 -07:00
c90be786da test-tool run-command: fix flip-flop init pattern
In be5d88e112 (test-tool run-command: learn to run (parts of) the
testsuite, 2019-10-04) an init pattern was added that would use
TESTSUITE_INIT, but then promptly memset() everything back to 0. We'd
then set the "dup" on the two string lists.

Our setting of "next" to "-1" thus did nothing, we'd reset it to "0"
before using it. Let's set it to "0" instead, and trust the
"STRING_LIST_INIT_DUP" to set "strdup_strings" appropriately for us.

Note that while we compile this code, there's no in-tree user for the
"testsuite" target being modified here anymore, see the discussion at
and around <nycvar.QRO.7.76.6.2109091323150.59@tvgsbejvaqbjf.bet>[1].

1. https://lore.kernel.org/git/nycvar.QRO.7.76.6.2109091323150.59@tvgsbejvaqbjf.bet/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-12 16:46:54 -07:00
f222bd34ff tests: remove leftover untracked files
Remove untracked files that are unwanted after they are done being used.

While the set of cases in this patch is certainly far from
comprehensive, it was motivated by some work to see what the fallout
would be if we were to make the removal of untracked files as a side
effect of other commands into an error.  Some cases were a bit more
involved than the testcase changes included in this patch, but the ones
included here represent the simple cases.  While this patch is not that
important since we are not changing the behavior of those other commands
into an error in the near term, I thought these changes were useful
anyway as an explicit documentation of the intent that these untracked
files are no longer useful.

Acked-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Acked-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-12 16:42:40 -07:00
8d133a4653 strvec: use size_t to store nr and alloc
We converted argv_array (which later became strvec) to use size_t in
819f0e76b1 (argv-array: use size_t for count and alloc, 2020-07-28) in
order to avoid the possibility of integer overflow. But later, commit
d70a9eb611 (strvec: rename struct fields, 2020-07-28) accidentally
converted these back to ints!

Those two commits were part of the same patch series. I'm pretty sure
what happened is that they were originally written in the opposite order
and then cleaned up and re-ordered during an interactive rebase. And
when resolving the inevitable conflict, I mistakenly took the "rename"
patch completely, accidentally dropping the type change.

We can correct it now; better late than never.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-12 16:25:23 -07:00
8f0f110156 compression: drop write-only core_compression_* variables
Since 8de7eeb54b (compression: unify pack.compression configuration
parsing, 2016-11-15) the variables core_compression_level and
core_compression_seen are only set, but never read.  Remove them.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-12 16:23:28 -07:00
7050791126 test-lib: remove unused $_x40 and $_z40 variables
These two have fallen out of use with the SHA-256 migration.

The last use of $_x40 was removed in fc7e73d7ef (t4013: improve
diff-post-processor logic, 2020-08-21) and

The last use of $_z40 was removed in 7a868c51c2 (t5562: use $ZERO_OID,
2019-12-21), but it was then needlessly refactored to be hash-agnostic
in 192b517589 (t: use hash-specific lookup tables to define test
constants, 2020-02-22). We can just remove it.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-12 16:22:42 -07:00
296339549a git-bisect: remove unused SHA-1 $x40 shell variable
This variable was last used in code removed in
06f5608c14 (bisect--helper: `bisect_start` shell function partially in
C, 2019-01-02).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-12 16:22:35 -07:00
be8d370e3c git-sh-setup: remove unused "pull with rebase" message
Remove the "pull with rebase" message previously used by the
git-pull.sh script, which was removed in 49eb8d39c7 (Remove
contrib/examples/*, 2018-03-25).

Even if some out-of-tree user copy/pasted the old git-pull.sh code,
and relied on passing it a "pull with rebase" argument, we'll fall
back on the "*" case here, they just won't get the "pull with rebase"
part of their message translated.

I don't think it's likely that anyone out-of-tree relied on that, but
I'm being conservative here per the discussion that can be found
upthread of [1].

1. https://lore.kernel.org/git/87tuiwjfvi.fsf@evledraar.gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-12 16:22:05 -07:00
162410f8a0 git-submodule: remove unused is_zero_oid() function
The is_zero_oid() function in git-submodule.sh has not been used since
e83e3333b5 (submodule: port submodule subcommand 'summary' from shell
to C, 2020-08-13), so we can remove it.

This was the last user of the sane_egrep() function in
git-sh-setup.sh. I'm not removing it in case some out-of-tree user
relied on it. Per the discussion that can be found upthread of [1].

1. https://lore.kernel.org/git/87tuiwjfvi.fsf@evledraar.gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-12 16:21:51 -07:00
09ef66179b packfile: use oidset for bad objects
Store the object ID of broken pack entries in an oidset instead of
keeping only their hashes in an unsorted array.  The resulting code is
shorter and easier to read.  It also handles the (hopefully) very rare
case of having a high number of bad objects better.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-12 16:14:32 -07:00
7407d733a4 packfile: convert has_packed_and_bad() to object_id
The single caller has a full object ID, so pass it on instead of just
its hash member.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-12 16:14:32 -07:00
751530de5d packfile: convert mark_bad_packed_object() to object_id
All callers have full object IDs, so pass them on instead of just their
hash member.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-12 16:14:32 -07:00
893b563505 midx: inline nth_midxed_pack_entry()
fill_midx_entry() finds the position of an object ID and passes it to
nth_midxed_pack_entry(), which uses the position to look up the object
ID for its own purposes.  Inline the latter into the former to avoid
that lookup.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-12 16:14:32 -07:00
325006f2db oidset: make oidset_size() an inline function
oidset_size() just reads a single word from memory and returns it.
Avoid the function call overhead for this trivial operation by turning
it into an inline function.

While we're at it, declare its parameter const to allow it to be used
on read-only oidsets.

Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-12 16:14:32 -07:00
e54e50201c INSTALL: reword and copy-edit the "libcurl" section
Make the "libcurl" section shorter and more to the point, this is
mostly based on suggestions from [1].

1. https://lore.kernel.org/git/YTtxcBdF2VQdWp5C@coredump.intra.peff.net/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-12 12:00:15 -07:00
5b952447cc INSTALL: don't mention the "curl" executable at all
In 1d53f90ed9 (The "curl" executable is no longer required,
2008-06-15) the wording for requiring curl(1) was changed to the
current "you might also want...".

Mentioning the "curl" executable at all is just confusing, someone
building git might want to use it to debug things, but they might also
just use wget(1) or some other http client. The "curl" executable has
the advantage that you might be able to e.g. reproduce a bug in git's
usage of libcurl with it, but anyone going to those extents is
unlikely to be aided by this note in INSTALL.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-12 12:00:14 -07:00
b40845293b help: correct the usage string in -h and documentation
Clarify the usage string in the documentation so we group e.g. -i and
--info, and add the missing short options to the "-h" output.

The alignment of the second line is off now, but will be fixed with
another series of mine[1]. In the meantime let's just assume that fix
will make it in eventually for the purposes of this patch, if it's
misaligned for a bit it doesn't matter much.

1. https://lore.kernel.org/git/cover-0.2-00000000000-20210901T110917Z-avarab@gmail.com

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-10 15:58:00 -07:00
c5ead19ea2 am: fix incorrect exit status on am fail to abort
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-10 15:51:15 -07:00
42b5e09d1e t4151: add a few am --abort tests
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-10 15:51:15 -07:00
ea7dc012d2 git-am.txt: clarify --abort behavior
Both Johannes and I assumed (perhaps due to familiarity with rebase)
that am --abort would return the user to a clean state.  However, since
am, unlike rebase, is intended to be used within a dirty working tree,
--abort will only clean the files involved in the am operation.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-10 15:51:14 -07:00
bee8691f19 stash: restore untracked files AFTER restoring tracked files
If a user deletes a file and places a directory of untracked files
there, then stashes all these changes, the untracked directory of files
cannot be restored until after the corresponding file in the way is
removed.  So, restore changes to tracked files before restoring
untracked files.

There is no counterpart problem to worry about with the user deleting an
untracked file and then add a tracked one in its place.  Git does not
track untracked files, and so will not know the untracked file was
deleted, and thus won't be able to stash the removal of that file.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-10 15:46:34 -07:00
3d40e3723b stash: avoid feeding directories to update-index
When a file is removed from the cache, but there is a file of the same
name present in the working directory, we would normally treat that file
in the working directory as untracked.  However, in the case of stash,
doing that would prevent a simple 'git stash push', because the untracked
file would be in the way of restoring the deleted file.

git stash, however, blindly assumes that whatever is in the working
directory for a deleted file is wanted and passes that path along to
update-index.  That causes problems when the working directory contains
a directory with the same name as the deleted file.  Add some code for
this special case that will avoid passing directory names to
update-index.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-10 15:46:34 -07:00
4dbf7f30b1 t3903: document a pair of directory/file bugs
There are three tests here, because the second bug is documented with
two tests: a file -> directory change and a directory -> file change.
The reason for the two tests is just to verify that both are indeed
broken but that both will be fixed by the same simple change (which will
be provided in a subsequent patch).

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-10 15:46:34 -07:00
1b421e7a5a docs/protocol-v2: point readers transport config discussion
We recently added tips for server admins to configure various transports
to support v2's GIT_PROTOCOL variable. While the protocol-v2 document is
pretty technical and not of interest to most admins, it may be a
starting point for them to figure out how to turn on v2. Let's put some
pointers from there to the other documentation.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-10 15:35:00 -07:00
2834a72d5e docs/git: discuss server-side config for GIT_PROTOCOL
The v2 protocol requires that the GIT_PROTOCOL environment variable gets
passed around, but we don't have any documentation describing how this
is supposed to work. In particular, we need to note what server admins
might need to configure to make things work.

The definition of the GIT_PROTOCOL variable is probably the best place
for this, since:

  - we deal with multiple transports (ssh, http, etc).
    Transport-specific documentation (like the git-http-backend bits
    added in the previous commit) are helpful for those transports, but
    this gives a broader overview. Plus we do not have a specific
    transport endpoint program for ssh, so this is a reasonable place to
    mention it.

  - the server side of the protocol involves multiple programs. For now,
    upload-pack is the only endpoint which uses GIT_PROTOCOL, but that
    will likely expand in the future. We're better off with a central
    discussion of what the server admin might need to do. However, for
    discoverability, this patch adds a pointer from upload-pack's
    documentation.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-10 15:34:59 -07:00
295d81b9e4 docs/http-backend: mention v2 protocol
Historically there was a little bit of configuration needed at the
webserver level in order to get the client's v2 protocol probes to Git.
But when we introduced the v2 protocol, we never documented these.

As of the previous commit, this should mostly work out of the box
without any explicit configuration. But it's worth documenting this to
make it clear how we expect it to work, especially in the face of
webservers which don't provide all headers over the CGI interface. Or
anybody who runs across this documentation but has an older version of
Git (or _used_ to have an older version, and wonders why they still have
a SetEnvIf line in their Apache config and whether it's still
necessary).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-10 15:34:59 -07:00
ff6a37c99e http-backend: handle HTTP_GIT_PROTOCOL CGI variable
When a client requests the v2 protocol over HTTP, they set the
Git-Protocol header. Webservers will generally make that available to
our CGI as HTTP_GIT_PROTOCOL in the environment. However, that's not
sufficient for upload-pack, etc, to respect it; they look in
GIT_PROTOCOL (without the HTTP_ prefix).

Either the webserver or the CGI is responsible for relaying that HTTP
header into the GIT_PROTOCOL variable. Traditionally, our tests have
configured the webserver to do so, but that's a burden on the server
admin. We can make this work out of the box by having the http-backend
CGI copy the contents of HTTP_GIT_PROTOCOL to GIT_PROTOCOL.

There are no new tests here. By removing the SetEnvIf line from our
test Apache config, we're now relying on this behavior of http-backend
to trigger the v2 protocol there (and there are numerous tests that fail
if this doesn't work).

There is one subtlety here: we copy HTTP_GIT_PROTOCOL only if there is
no existing GIT_PROTOCOL variable. That leaves the webserver admin free
to override the client's decision if they choose. This is unlikely to be
useful in practice, but is more flexible. And indeed, it allows the
v2-to-v0 fallback test added in the previous commit to continue working.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-10 15:34:59 -07:00
26146980f1 t5551: test v2-to-v0 http protocol fallback
Since we use the v2 protocol by default, the connection of a v2 client
to a v2 server is well covered by the test suite. And with the
GIT_TEST_PROTOCOL_VERSION knob, we can easily test a v0 client
connecting to a v2-aware server (which will then just speak v0). But we
have no regular tests that a v2 client, when encountering a non-v2-aware
server, will correctly fall back to using v0.

In theory this is a job for the cross-version tests in t/interop, but:

  - they cover only git:// and file:// clones

  - they are not part of the usual test suite, so nobody ever runs them
    anyway

Since using v2 over http requires configuring the web server to pass
along the Git-Protocol header, we can easily create a situation where
the server does not respect the v2 probe, and the conversation falls
back to v0.

This works just fine. This new test is not about fixing any particular
bug, but just making sure that the system works (and continues to work)
as expected.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-10 15:34:58 -07:00
6346f704a0 index-pack: use xopen in init_thread
Support an arbitrary file descriptor expression in the semantic patch
for replacing open+die_errno with xopen, not just an identifier, and
apply it.  This makes the error message at the single affected place
more consistent and reduces code duplication.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-10 14:22:50 -07:00
1bfb57f642 ssh signing: test that gpg fails for unknown keys
Test that verify-commit/tag will fail when a gpg key is completely
unknown. To do this we have to generate a key, use it for a signature
and delete it from our keyring aferwards completely.

Signed-off-by: Fabian Stelzer <fs@gigacodes.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-10 14:15:53 -07:00
f265f2d630 ssh signing: tests for logs, tags & push certs
Signed-off-by: Fabian Stelzer <fs@gigacodes.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-10 14:15:53 -07:00
3326a783f1 ssh signing: duplicate t7510 tests for commits
Signed-off-by: Fabian Stelzer <fs@gigacodes.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-10 14:15:52 -07:00
facca53ac3 ssh signing: verify signatures using ssh-keygen
To verify a ssh signature we first call ssh-keygen -Y find-principal to
look up the signing principal by their public key from the
allowedSignersFile. If the key is found then we do a verify. Otherwise
we only validate the signature but can not verify the signers identity.

Verification uses the gpg.ssh.allowedSignersFile (see ssh-keygen(1) "ALLOWED
SIGNERS") which contains valid public keys and a principal (usually
user@domain). Depending on the environment this file can be managed by
the individual developer or for example generated by the central
repository server from known ssh keys with push access. This file is usually
stored outside the repository, but if the repository only allows signed
commits/pushes, the user might choose to store it in the repository.

To revoke a key put the public key without the principal prefix into
gpg.ssh.revocationKeyring or generate a KRL (see ssh-keygen(1)
"KEY REVOCATION LISTS"). The same considerations about who to trust for
verification as with the allowedSignersFile apply.

Using SSH CA Keys with these files is also possible. Add
"cert-authority" as key option between the principal and the key to mark
it as a CA and all keys signed by it as valid for this CA.
See "CERTIFICATES" in ssh-keygen(1).

Signed-off-by: Fabian Stelzer <fs@gigacodes.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-10 14:15:52 -07:00
4838f62c8c ssh signing: provide a textual signing_key_id
For ssh the user.signingkey can be a filename/path or even a literal ssh pubkey.
In push certs and textual output we prefer the ssh fingerprint instead.

Signed-off-by: Fabian Stelzer <fs@gigacodes.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-10 14:15:52 -07:00
fd9e226776 ssh signing: retrieve a default key from ssh-agent
If user.signingkey is not set and a ssh signature is requested we call
gpg.ssh.defaultKeyCommand (typically "ssh-add -L") and use the first key we get

Signed-off-by: Fabian Stelzer <fs@gigacodes.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-10 14:15:52 -07:00
29b315778e ssh signing: add ssh key format and signing code
Implements the actual sign_buffer_ssh operation and move some shared
cleanup code into a strbuf function

Set gpg.format = ssh and user.signingkey to either a ssh public key
string (like from an authorized_keys file), or a ssh key file.
If the key file or the config value itself contains only a public key
then the private key needs to be available via ssh-agent.

gpg.ssh.program can be set to an alternative location of ssh-keygen.
A somewhat recent openssh version (8.2p1+) of ssh-keygen is needed for
this feature. Since only ssh-keygen is needed it can this way be
installed seperately without upgrading your system openssh packages.

Signed-off-by: Fabian Stelzer <fs@gigacodes.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-10 14:15:51 -07:00
64625c728f ssh signing: add test prereqs
Generate some ssh keys and a allowedSignersFile for testing

Signed-off-by: Fabian Stelzer <fs@gigacodes.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-10 14:15:51 -07:00
b5726a5d9c ssh signing: preliminary refactoring and clean-up
Openssh v8.2p1 added some new options to ssh-keygen for signature
creation and verification. These allow us to use ssh keys for git
signatures easily.

In our corporate environment we use PIV x509 Certs on Yubikeys for email
signing/encryption and ssh keys which I think is quite common
(at least for the email part). This way we can establish the correct
trust for the SSH Keys without setting up a separate GPG Infrastructure
(which is still quite painful for users) or implementing x509 signing
support for git (which lacks good forwarding mechanisms).
Using ssh agent forwarding makes this feature easily usable in todays
development environments where code is often checked out in remote VMs / containers.
In such a setup the keyring & revocationKeyring can be centrally
generated from the x509 CA information and distributed to the users.

To be able to implement new signing formats this commit:
 - makes the sigc structure more generic by renaming "gpg_output" to
   "output"
 - introduces function pointers in the gpg_format structure to call
   format specific signing and verification functions
 - moves format detection from verify_signed_buffer into the check_signature
   api function and calls the format specific verify
 - renames and wraps sign_buffer to handle format specific signing logic
   as well

Signed-off-by: Fabian Stelzer <fs@gigacodes.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-10 14:15:51 -07:00
8b7c11b866 The fifth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-10 11:47:10 -07:00
96ac07f4a9 Merge branch 'ab/help-autocorrect-prompt'
The logic for auto-correction of misspelt subcommands learned to go
interactive when the help.autocorrect configuration variable is set
to 'prompt'.

* ab/help-autocorrect-prompt:
  help.c: help.autocorrect=prompt waits for user action
2021-09-10 11:46:33 -07:00
613204b948 Merge branch 'cb/ci-build-pedantic'
CI update.

* cb/ci-build-pedantic:
  ci: run a pedantic build as part of the GitHub workflow
2021-09-10 11:46:32 -07:00
9762646ee4 Merge branch 'gh/gitweb-branch-sort'
Tie-break branches that point at the same object in the list of
branches on GitWeb to show the one pointed at by HEAD early.

* gh/gitweb-branch-sort:
  gitweb: use HEAD as secondary sort key in git_get_heads_list()
2021-09-10 11:46:32 -07:00
05665a0dff Merge branch 'rs/archive-use-object-id'
Code cleanup.

* rs/archive-use-object-id:
  archive: convert queue_directory to struct object_id
2021-09-10 11:46:31 -07:00
09f66eb0e2 Merge branch 'rs/show-branch-simplify'
Code cleanup.

* rs/show-branch-simplify:
  show-branch: simplify rev_is_head()
2021-09-10 11:46:31 -07:00
bfe37f3dc5 Merge branch 'jk/log-warn-on-bogus-encoding'
Doc update plus improved error reporting.

* jk/log-warn-on-bogus-encoding:
  docs: use "character encoding" to refer to commit-object encoding
  logmsg_reencode(): warn when iconv() fails
2021-09-10 11:46:30 -07:00
a4b1a0ade4 Merge branch 'cb/remote-ndebug-fix'
Build fix.

* cb/remote-ndebug-fix:
  remote: avoid -Wunused-but-set-variable in gcc with -DNDEBUG
2021-09-10 11:46:30 -07:00
fd0d7036e0 Merge branch 'ab/retire-advice-config'
Code clean up to migrate callers from older advice_config[] based
API to newer advice_if_enabled() and advice_enabled() API.

* ab/retire-advice-config:
  advice: move advice.graftFileDeprecated squashing to commit.[ch]
  advice: remove use of global advice_add_embedded_repo
  advice: remove read uses of most global `advice_` variables
  advice: add enum variants for missing advice variables
2021-09-10 11:46:29 -07:00
6d09fc54f6 Merge branch 'mk/clone-recurse-submodules'
After "git clone --recurse-submodules", all submodules are cloned
but they are not by default recursed into by other commands.  With
submodule.stickyRecursiveClone configuration set, submodule.recurse
configuration is set to true in a repository created by "clone"
with "--recurse-submodules" option.

* mk/clone-recurse-submodules:
  clone: set submodule.recurse=true if submodule.stickyRecursiveClone enabled
2021-09-10 11:46:29 -07:00
f0b567898b Merge branch 'ab/mailmap-leakfix'
Leakfix.

* ab/mailmap-leakfix:
  mailmap.c: fix a memory leak in free_mailap_{info,entry}()
2021-09-10 11:46:28 -07:00
02d263277a Merge branch 'ab/gc-log-rephrase'
A pathname in an advice message has been made cut-and-paste ready.

* ab/gc-log-rephrase:
  gc: remove trailing dot from "gc.log" line
2021-09-10 11:46:28 -07:00
6dbe1b4ee2 Merge branch 'uk/userdiff-php-enum'
Update the userdiff pattern for PHP.

* uk/userdiff-php-enum:
  userdiff: support enum keyword in PHP hunk header
2021-09-10 11:46:27 -07:00
febba8038d Merge branch 'tk/fast-export-anonymized-tag-fix'
The output from "git fast-export", when its anonymization feature
is in use, showed an annotated tag incorrectly.

* tk/fast-export-anonymized-tag-fix:
  fast-export: fix anonymized tag using original length
2021-09-10 11:46:27 -07:00
cd6a1fc8a2 Merge branch 'ba/object-info'
Leakfix.

* ba/object-info:
  protocol-caps.c: fix memory leak in send_info()
2021-09-10 11:46:26 -07:00
1396a95ee0 Merge branch 'ab/commit-graph-usage'
Fixes on usage message from "git commit-graph".

* ab/commit-graph-usage:
  commit-graph: show "unexpected subcommand" error
  commit-graph: show usage on "commit-graph [write|verify] garbage"
  commit-graph: early exit to "usage" on !argc
  multi-pack-index: refactor "goto usage" pattern
  commit-graph: use parse_options_concat()
  commit-graph: remove redundant handling of -h
  commit-graph: define common usage with a macro
2021-09-10 11:46:25 -07:00
bd29bcf913 Merge branch 'mh/send-email-reset-in-reply-to'
Even when running "git send-email" without its own threaded
discussion support, a threading related header in one message is
carried over to the subsequent message to result in an unwanted
threading, which has been corrected.

* mh/send-email-reset-in-reply-to:
  send-email: avoid incorrect header propagation
2021-09-10 11:46:25 -07:00
0538a2586e Merge branch 'rs/more-fspathcmp'
Code simplification.

* rs/more-fspathcmp:
  merge-recursive: use fspathcmp() in path_hashmap_cmp()
2021-09-10 11:46:24 -07:00
e4897d2bde Merge branch 'sg/set-ceiling-during-tests'
Buggy tests could damage repositories outside the throw-away test
area we created.  We now by default export GIT_CEILING_DIRECTORIES
to limit the damage from such a stray test.

* sg/set-ceiling-during-tests:
  test-lib: set GIT_CEILING_DIRECTORIES to protect the surrounding repository
2021-09-10 11:46:24 -07:00
6f9e7cadf3 Merge branch 'jh/sparse-index-resize-fix'
The sparse-index support can corrupt the index structure by storing
a stale and/or uninitialized data, which has been corrected.

* jh/sparse-index-resize-fix:
  sparse-index: copy dir_hash in ensure_full_index()
2021-09-10 11:46:23 -07:00
b4ceeef962 Merge branch 'es/walken-tutorial-fix'
Typofix.

* es/walken-tutorial-fix:
  doc: fix syntax error and the format of printf
2021-09-10 11:46:23 -07:00
9559de3b66 Merge branch 'tb/add-objects-in-unpacked-packs-simplify'
Code simplification with reduced memory usage.

* tb/add-objects-in-unpacked-packs-simplify:
  builtin/pack-objects.c: remove duplicate hash lookup
  builtin/pack-objects.c: simplify add_objects_in_unpacked_packs()
  object-store.h: teach for_each_packed_object to ignore kept packs
2021-09-10 11:46:21 -07:00
87d4aed743 Merge branch 'ps/fetch-omit-formatting-under-quiet'
"git fetch --quiet" optimization to avoid useless computation of
info that will never be displayed.

* ps/fetch-omit-formatting-under-quiet:
  fetch: skip formatting updated refs with `--quiet`
2021-09-10 11:46:20 -07:00
1ab13eb973 Merge branch 'ka/want-ref-in-namespace'
"git upload-pack" which runs on the other side of "git fetch"
forgot to take the ref namespaces into account when handling
want-ref requests.

* ka/want-ref-in-namespace:
  docs: clarify the interaction of transfer.hideRefs and namespaces
  upload-pack.c: treat want-ref relative to namespace
  t5730: introduce fetch command helper
2021-09-10 11:46:20 -07:00
173368d73d Merge branch 'zh/cherry-pick-advice'
The advice message that "git cherry-pick" gives when it asks
conflicted replay of a commit to be resolved by the end user has
been updated.

* zh/cherry-pick-advice:
  cherry-pick: use better advice message
2021-09-10 11:46:19 -07:00
6c083b7619 Merge branch 'js/advise-when-skipping-cherry-picked'
"git rebase" by default skips changes that are equivalent to
commits that are already in the history the branch is rebased onto;
give messages when this happens to let the users be aware of
skipped commits, and also teach them how to tell "rebase" to keep
duplicated changes.

* js/advise-when-skipping-cherry-picked:
  sequencer: advise if skipping cherry-picked commit
2021-09-10 11:46:19 -07:00
4bc1fd6e39 pack-objects: rename .idx files into place after .bitmap files
In preceding commits the race of renaming .idx files in place before
.rev files and other auxiliary files was fixed in pack-write.c's
finish_tmp_packfile(), builtin/repack.c's "struct exts", and
builtin/index-pack.c's final(). As noted in the change to pack-write.c
we left in place the issue of writing *.bitmap files after the *.idx,
let's fix that issue.

See 7cc8f97108 (pack-objects: implement bitmap writing, 2013-12-21)
for commentary at the time when *.bitmap was implemented about how
those files are written out, nothing in that commit contradicts what's
being done here.

Note that this commit and preceding ones only close any race condition
with *.idx files being written before their auxiliary files if we're
optimistic about our lack of fsync()-ing in this are not tripping us
over. See the thread at [1] for a rabbit hole of various discussions
about filesystem races in the face of doing and not doing fsync() (and
if doing fsync(), not doing it properly).

We may want to fsync the containing directory once after renaming the
*.idx file into place, but that is outside the scope of this series.

1. https://lore.kernel.org/git/8735qgkvv1.fsf@evledraar.gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09 18:23:11 -07:00
2ec02dd5a8 pack-write: split up finish_tmp_packfile() function
Split up the finish_tmp_packfile() function and use the split-up version
in pack-objects.c in preparation for moving the step of renaming the
*.idx file later as part of a function change.

Since the only other caller of finish_tmp_packfile() was in
bulk-checkin.c, and it won't be needing a change to its *.idx renaming,
provide a thin wrapper for the old function as a static function in that
file. If other callers end up needing the simpler version it could be
moved back to "pack-write.c" and "pack.h".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09 18:23:11 -07:00
522a5c2cf5 builtin/index-pack.c: move .idx files into place last
In a similar spirit as preceding patches to `git repack` and `git
pack-objects`, fix the identical problem in `git index-pack`.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09 18:23:11 -07:00
8737dab346 index-pack: refactor renaming in final()
Refactor the renaming in final() into a helper function, this is
similar in spirit to a preceding refactoring of finish_tmp_packfile()
in pack-write.c.

Before e37d0b8730 (builtin/index-pack.c: write reverse indexes,
2021-01-25) it probably wasn't worth it to have this sort of helper,
due to the differing "else if" case for "pack" files v.s. "idx" files.

But since we've got "rev" as well now, let's do the renaming via a
helper, this is both a net decrease in lines, and improves the
readability, since we can easily see at a glance that the logic for
writing these three types of files is exactly the same, aside from the
obviously differing cases of "*final_name" being NULL, and
"make_read_only_if_same" being different.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09 18:23:11 -07:00
4e58cedd94 builtin/repack.c: move .idx files into place last
In a similar spirit as the previous patch, fix the identical problem
from `git repack` (which invokes `pack-objects` with a temporary
location for output, and then moves the files into their final locations
itself).

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09 18:23:11 -07:00
16a86907bc pack-write.c: rename .idx files after *.rev
We treat the presence of an `.idx` file as the indicator of whether or
not it's safe to use a packfile. But `finish_tmp_packfile()` (which is
responsible for writing the pack and moving the temporary versions of
all of its auxiliary files into place) is inconsistent about the write
order.

Specifically, it moves the `.rev` file into place after the `.idx`,
leaving open the possibility to open a pack which looks "ready" (because
the `.idx` file exists and is readable) but appears momentarily to not
have a `.rev` file. This causes Git to fall back to generating the
pack's reverse index in memory.

Though racy, this amounts to an unnecessary slow-down at worst, and
doesn't affect the correctness of the resulting reverse index.

Close this race by moving the .rev file into place before moving the
.idx file into place.

This still leaves the issue of `.idx` files being renamed into place
before the auxiliary `.bitmap` file is renamed when in pack-object.c's
write_pack_file() "write_bitmap_index" is true. That race will be
addressed in subsequent commits.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09 18:23:11 -07:00
66833f0e70 pack-write: refactor renaming in finish_tmp_packfile()
Refactor the renaming in finish_tmp_packfile() into a helper function.
The callers are now expected to pass a "name_buffer" ending in
"pack-OID." instead of the previous "pack-", we then append "pack",
"idx" or "rev" to it.

By doing the strbuf_setlen() in rename_tmp_packfile() we reuse the
buffer and avoid the repeated allocations we'd get if that function had
its own temporary "struct strbuf".

This approach of reusing the buffer does make the last user in
pack-object.c's write_pack_file() slightly awkward, since we needlessly
do a strbuf_setlen() before calling strbuf_release() for consistency. In
subsequent changes we'll move that bitmap writing code around, so let's
not skip the strbuf_setlen() now.

The previous strbuf_reset() idiom originated with 5889271114
(finish_tmp_packfile():use strbuf for pathname construction,
2014-03-03), which in turn was a minimal adjustment of pre-strbuf code
added in 0e990530ae (finish_tmp_packfile(): a helper function,
2011-10-28).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09 18:23:11 -07:00
ae44b5a4f3 bulk-checkin.c: store checksum directly
finish_bulk_checkin() stores the checksum from finalize_hashfile() by
writing to the `hash` member of `struct object_id`, but that hash has
nothing to do with an object id (it's just a convenient location that
happens to be sized correctly).

Store the hash directly in an unsigned char array. This behaves the same
as writing to the `hash` member, but makes the intent clearer (and
avoids allocating an extra four bytes for the `algo` member of `struct
object_id`). It likewise prevents the possibility of a segfault when
reading `algo` (e.g., by calling `oid_to_hex()`) if it is uninitialized.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09 18:23:11 -07:00
e8f55568de t5562: use alarm() to interrupt timed child-wait
The t5562 script occasionally takes 60 extra seconds to complete due to
a race condition in the invoke-with-content-length.pl helper.

The way it's supposed to work is this:

  - we set up a SIGCLD handler

  - we kick off http-backend and write to it with a set content-length,
    but _don't_ close the pipe

  - we sleep for 60 seconds, assuming that SIGCLD from http-backend
    finishing will interrupt us

  - after the sleep finishes (whetherby 60 seconds or because it was
    interrupted by the signal), we check a flag to see if our SIGCLD
    handler was called. If not, then we complain.

This usually completes immediately, because the signal interrupts our
sleep. But very occasionally the child process dies _before_ we hit the
sleep, so we don't realize it. The test still completes successfully
(because our $exited flag is set), but it takes an extra 60 seconds.

There's no way to check the flag and sleep atomically. So the best we
can do with this approach is to sleep in smaller chunks (say, 1 second)
and check the flag incrementally. Then we waste a maximum of 1 second if
we lose the race. This was proposed in:

  https://lore.kernel.org/git/20190218205028.32486-1-max@max630.net/

and it does work. But we can do better.

Instead of blocking on sleep and waiting for the child signal to
interrupt us, we can block on the child exiting and set an alarm signal
to trigger the timeout.

This lets us exit the script immediately when the child behaves (with no
race possible), and wait a maximum of 60 seconds when it doesn't.

Note one small subtlety: perl is very willing to restart the waitpid()
call after the alarm is delivered, even if we've thrown an exception via
die. "perldoc -f alarm" claims you can get around this with an eval/die
combo (and even has some example code), but it doesn't seem to work for
me with waitpid(); instead, we continue waiting until the child exits.

So instead, we'll instruct the child process to exit in the alarm
handler itself. In the original code this was done by calling
close($out). That would continue to work, since our child is always
http-backend, which should exit when its stdin closes. But we can be
even more robust against a hung or confused child by sending a KILL
signal, which should terminate it immediately.

Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09 17:45:43 -07:00
35cf94eaf6 refs/files-backend: remove unused open mode parameter
We only need to provide a mode if we are willing to let open(2) create
the file, which is not the case here, so drop the unnecessary parameter.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09 17:40:28 -07:00
d9a65b6c0a setup: use xopen and xdup in sanitize_stdfds
Replace the catch-all error message with specific ones for opening and
duplicating by calling the wrappers xopen and xdup.  The code becomes
easier to follow when error handling is reduced to two letters.

Remove the unnecessary mode parameter while at it -- we expect /dev/null
to already exist.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09 17:40:02 -07:00
73cd7d9420 pack-bitmap: drop bitmap_index argument from try_partial_reuse()
Starting in commit 0f533c7284 (pack-bitmap: read multi-pack bitmaps,
2021-08-31), we no longer look at the "struct bitmap_index" passed to
try_partial_reuse(). This is because we only handle verbatim reuse from
a single pack: either the pack whose bitmap we're looking at, or the
"preferred" pack of a midx bitmap. And thus the primary item we look at
is the "pack" parameter added by that same commit, and not the
bitmap_git->pack parameter (which would be NULL for a midx bitmap). It's
our caller, reuse_partial_packfile_from_bitmap(), which decides which
pack to use and passes it in to us.

Drop the unused parameter to prevent confusion.

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09 17:32:40 -07:00
bfbb60d328 pack-bitmap: drop repository argument from prepare_midx_bitmap_git()
We never look at the repository argument which is passed. This makes
sense, since the multi_pack_index struct already tells us everything we
need to access the files in its associated object directory.

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09 17:32:37 -07:00
516680ba77 sparse-index: integrate with cherry-pick and rebase
The hard work was already done with 'git merge' and the ORT strategy.
Just add extra tests to see that we get the expected results in the
non-conflict cases.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09 15:49:05 -07:00
5d9c9349bd sequencer: ensure full index if not ORT strategy
The sequencer is used by 'cherry-pick' and 'rebase' to sequence a list
of operations that modify the index. Since we intend to remove the need
for 'command_requires_full_index', we need to ensure the sparse index is
expanded every time it is written to disk between these steps. That is,
unless the merge strategy is 'ort' where the index can remain sparse
throughout.

There are two main places to be extra careful about a full index:

1. Right before calling merge_trees(), ensure the index is full. This
   happens within an 'else' where the 'if' block checks if the 'ort'
   strategy is selected.

2. During read_and_refresh_cache(), the index might be written to disk
   and converted to sparse in the process. Ensure it expands back to
   full afterwards by checking if the strategy is _not_ 'ort'. This
   'if' statement is the logical negation of the 'if' in item (1).

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09 15:49:05 -07:00
c0b99303db t1092: add cherry-pick, rebase tests
Add tests to check that cherry-pick and rebase behave the same in the
sparse-index case as in the full index cases. These tests are agnostic
to GIT_TEST_MERGE_ALGORITHM, so a full CI test suite will check both the
'ort' and 'recursive' strategies on this test.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09 15:49:04 -07:00
6957636792 merge-ort: expand only for out-of-cone conflicts
Merge conflicts happen often enough to want to avoid expanding a sparse
index when they happen, as long as those conflicts are within the
sparse-checkout cone. If a conflict exists outside of the
sparse-checkout cone, then we still need to expand before iterating over
the index entries. This is critical to do in advance because of how the
original_cache_nr is tracked to allow inserting and replacing cache
entries.

Iterate over the conflicted files and check if any paths are outside of
the sparse-checkout cone. If so, then expand the full index.

Add a test that demonstrates that we do not expand the index, even when
we hit a conflict within the sparse-checkout cone.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09 15:49:04 -07:00
a33806398a merge: make sparse-aware with ORT
Allow 'git merge' to operate without expanding a sparse index, at least
not immediately. The index still will be expanded in a few cases:

1. If the merge strategy is 'recursive', then we enable
   command_requires_full_index at the start of the merge_recursive()
   method. We expect sparse-index users to also have the 'ort' strategy
   enabled.

2. With the 'ort' strategy, if the merge results in a conflicted file,
   then we expand the index before updating the working tree. The loop
   that iterates over the worktree replaces index entries and tracks
   'origintal_cache_nr' which can become completely wrong if the index
   expands in the middle of the operation. This safety valve is
   important before that loop starts. A later change will focus this
   to only expand if we indeed have a conflict outside of the
   sparse-checkout cone.

3. Other merge strategies are executed as a 'git merge-X' subcommand,
   and those strategies are currently protected with the
   'command_requires_full_index' guard.

Some test updates are required, including a mistaken 'git checkout -b'
that did not specify the base branch, causing merges to be fast-forward
merges.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09 15:49:04 -07:00
ad90da7351 diff: ignore sparse paths in diffstat
The diff_populate_filespec() method is used to describe the diff after a
merge operation is complete. In order to avoid expanding a sparse index,
the reuse_worktree_file() needs to be adapted to ignore files that are
outside of the sparse-checkout cone. The file names and OIDs used for
this check come from the merged tree in the case of the ORT strategy,
not the index, hence the ability to look into these paths without having
already expanded the index.

The work done by reuse_worktree_file() is only an optimization, and
requires the file being on disk for it to be of any value. Thus, it is
safe to exit the method early if we do not expect the file on disk.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09 15:49:04 -07:00
10a0d6ae64 revision: remove "submodule" from opt struct
Clean up a TODO in revision.h by removing the "submodule" field from
struct setup_revision_opt. This field is only used to specify the ref
store to use, so use rev_info->repo to determine the ref store instead.

The only users of this field are merge-ort.c and merge-recursive.c.
However, both these files specify the superproject as rev_info->repo and
the submodule as setup_revision_opt->submodule. In order to be able to
pass the submodule as rev_info->repo, all commits must be parsed with
the submodule explicitly specified; this patch does that as well. (An
incremental solution in which only some commits are parsed with explicit
submodule will not work, because if the same commit is parsed twice in
different repositories, there will be 2 heap-allocated object structs
corresponding to that commit, and any flag set by the revision walking
mechanism on one of them will not be reflected onto the other.)

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09 14:09:30 -07:00
8eb8dcf946 repository: support unabsorbed in repo_submodule_init
In preparation for a subsequent commit that migrates code using
add_submodule_odb() to repo_submodule_init(), teach
repo_submodule_init() to support submodules with unabsorbed gitdirs.
(See the documentation for "git submodule absorbgitdirs" for more
information about absorbed and unabsorbed gitdirs.)

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09 14:09:30 -07:00
5df5106e1e submodule: remove unnecessary unabsorbed fallback
In get_submodule_repo_for(), there is a fallback code path for the case
in which a submodule has an unabsorbed gitdir. (See the documentation
for "git submodule absorbgitdirs" for more information about absorbed
and unabsorbed gitdirs.) However, this code path is unnecessary, because
such submodules are already handled: when the fetch_task is created in
fetch_task_create(), it will create its own struct submodule with a path
and name, and repo_submodule_init() can handle such a struct.

This fallback was introduced in 26f80ccfc1 ("submodule: migrate
get_next_submodule to use repository structs", 2018-12-05). It was
unnecessary even then, but perhaps it escaped notice because its parent
commit d5498e0871 ("repository: repo_submodule_init to take a submodule
struct", 2018-12-05) was the one that taught repo_submodule_init() to
handle such created structs. Before, it took a path and always checked
.gitmodules, so it truly would have failed if there were no entry in
.gitmodules.

(Note to reviewers: in 26f80ccfc1, the "own struct submodule" I
mentioned is in get_next_submodule(), not fetch_task_create().)

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09 14:09:30 -07:00
c4dee2c085 Close object store closer to spawning child processes
In many cases where we spawned child processes that _may_ trigger a
repack, we explicitly closed the object store first (so that the
`repack` process can delete the `.pack` files, which would otherwise not
be possible on Windows since files cannot be deleted as long as they as
still in use).

Wherever possible, we now use the new `close_object_store` bit of the
`run_command()` API, to delay closing the object store even further.
This makes the code easier to maintain because it is now more obvious
that we only release those file handles because of those child
processes.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09 12:56:11 -07:00
5a22a334cb run_auto_maintenance(): implicitly close the object store
Before spawning the auto maintenance, we need to make sure that we
release all open file handles to all the `.pack` files (and MIDX files
and commit-graph files and...) so that the maintenance process has the
freedom to delete those files.

So far, we did this manually every time before calling
`run_auto_maintenance()`. With the new `close_object_store` flag, we can
do that implicitly in that function, which is more robust because future
callers won't be able to forget to close the object store.

Note: this changes behavior slightly, as we previously _always_ closed
the object store, but now we only close the object store when actually
running the auto maintenance. In practice, this should not matter (if
anything, it might speed up operations where auto maintenance is
disabled).

Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09 12:56:11 -07:00
28d04e1ec1 run-command: offer to close the object store before running
Especially on Windows, where files cannot be deleted if _any_ process
holds an open file handle to them, it is important to close the object
store (releasing all handles to all `.pack` files) before running a
command that might spawn a garbage collection.

This scenario is so common that we frequently see the pattern of closing
the object store before running auto maintenance or another Git command.

Let's make this much more convenient by teaching the `run_command()`
machinery a new flag to release the object store before spawning the
process.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09 12:56:11 -07:00
3322a9d87f run-command: prettify the RUN_COMMAND_* flags
The values were listed unaligned, and with powers of two spelled out in
decimal. The list is easier to parse for human readers if the numbers
are aligned and spelled out as powers of two (using the bit-shift
operator `<<`).

While at it, remove a code comment that was unclear at best, and
confusing at worst.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09 12:56:11 -07:00
a788d31931 ci: new github-action for git-l10n code review
The repository of git-l10n is a fork of "git/git" on GitHub, and uses
GitHub pull request for code review. A helper program "git-po-helper"
can be used to check typos in ".po" files, validate syntax, and check
commit messages. It would be convenient to integrate this helper program
to CI and add comments in pull request.

The new github-action workflow will be enabled for l10n related
operations, such as:

 * Operations on a repository named as "git-po", such as a repository
   forked from "git-l10n/git-po".

 * Push to a branch that contains "l10n" in the name.

 * Pull request from a remote branch which has "l10n" in the name, such
   as: "l10n/fix-fuzzy-translations".

The new l10n workflow listens to two types of github events:

    on: [push, pull_request_target]

The reason we use "pull_request_target" instead of "pull_request" is
that pull requests from forks receive a read-only GITHUB_TOKEN and
workflows cannot write comments back to pull requests for security
reasons. GitHub provides a "pull_request_target" event to resolve
security risks by checking out the base commit from the target
repository, and provide write permissions for the workflow.

By default, administrators can set strict permissions for workflows. The
following code is used to modify the permissions for the GITHUB_TOKEN
and grant write permission in order to create comments in pull-requests.

    permissions:
      pull-requests: write

This workflow will scan commits one by one. If a commit does not look
like a l10n commit (no file in "po/" has been changed), the scan process
will stop immediately. For a "push" event, no error will be reported
because it is normal to push non-l10n commits merged from upstream. But
for the "pull_request_target" event, errors will be reported. For this
reason, additional option is provided for "git-po-helper".

    git-po-helper check-commits \
        --github-action-event="${{ github.event_name }}" -- \
        <base>..<head>

The output messages of "git-po-helper" contain color codes not only for
console, but also for logfile. This is because "git-po-helper" uses a
package named "logrus" for logging, and I use an additional option
"ForceColor" to initialize "logrus" to print messages in a user-friendly
format in logfile output. These color codes help produce beautiful
output for the log of workflow, but they must be stripped off when
creating comments for pull requests. E.g.:

    perl -pe 's/\e\[[0-9;]*m//g' git-po-helper.out

"git-po-helper" may generate two kinds of suggestions, errors and
warnings. All the errors and warnings will be reported in the log of the
l10n workflow. However, warnings in the log of the workflow for a
successfully running "git-po-helper" can easily be ignored by users.
For the "pull_request_target" event, this issue is resolved by creating
an additional comment in the pull request. A l10n contributor should try
to fix all the errors, and should pay attention to the warnings.

Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09 12:55:21 -07:00
0c41a887b4 pack.h: line-wrap the definition of finish_tmp_packfile()
Line-wrap the definition of finish_tmp_packfile() to make subsequent
changes easier to read. See 0e990530ae (finish_tmp_packfile(): a
helper function, 2011-10-28) for the commit that introduced this
overly long line.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09 10:51:57 -07:00
bf6d819bc1 entry: show finer-grained counter in "Filtering content" progress line
The "Filtering content" progress in entry.c:finish_delayed_checkout()
is unusual because of how it calculates the progress count and because
it shows the progress of a nested loop.  It works basically like this:

  start_delayed_progress(p, nr_of_paths_to_filter)
  for_each_filter {
      display_progress(p, nr_of_paths_to_filter - nr_of_paths_still_left_to_filter)
      for_each_path_handled_by_the_current_filter {
          checkout_entry()
      }
  }
  stop_progress(p)

There are two issues with this approach:

  - The work done by the last filter (or the only filter if there is
    only one) is never counted, so if the last filter still has some
    paths to process, then the counter shown in the "done" progress
    line will not match the expected total.

    The partially-RFC series to add a GIT_TEST_CHECK_PROGRESS=1
    mode[1] helps spot this issue. Under it the 'missing file in
    delayed checkout' and 'invalid file in delayed checkout' tests in
    't0021-conversion.sh' fail, because both use only one
    filter.  (The test 'delayed checkout in process filter' uses two
    filters but the first one does all the work, so that test already
    happens to succeed even with GIT_TEST_CHECK_PROGRESS=1.)

  - The progress counter is updated only once per filter, not once per
    processed path, so if a filter has a lot of paths to process, then
    the counter might stay unchanged for a long while and then make a
    big jump (though the user still gets a sense of progress, because
    we call display_throughput() after each processed path to show the
    amount of processed data).

Move the display_progress() call to the inner loop, right next to that
checkout_entry() call that does the hard work for each path, and use a
dedicated counter variable that is incremented upon processing each
path.

After this change the 'invalid file in delayed checkout' in
't0021-conversion.sh' would succeed with the GIT_TEST_CHECK_PROGRESS=1
assertion discussed above, but the 'missing file in delayed checkout'
test would still fail.

It'll fail because its purposefully buggy filter doesn't process any
paths, so we won't execute that inner loop at all, see [2] for how to
spot that issue without GIT_TEST_CHECK_PROGRESS=1. It's not
straightforward to fix it with the current progress.c library (see [3]
for an attempt), so let's leave it for now.

Let's also initialize the *progress to "NULL" while we're at it. Since
7a132c628e (checkout: make delayed checkout respect --quiet and
--no-progress, 2021-08-26) we have had progress conditional on
"show_progress", usually we use the idiom of a "NULL" initialization
of the "*progress", rather than the more verbose ternary added in
7a132c628e.

1. https://lore.kernel.org/git/20210620200303.2328957-1-szeder.dev@gmail.com/
2. http://lore.kernel.org/git/20210802214827.GE23408@szeder.dev
3. https://lore.kernel.org/git/20210620200303.2328957-7-szeder.dev@gmail.com/

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09 09:58:19 -07:00
4011224944 commit-graph: fix bogus counter in "Scanning merged commits" progress line
The final value of the counter of the "Scanning merged commits"
progress line is always one less than its expected total, e.g.:

  Scanning merged commits:  83% (5/6), done.

This happens because while iterating over an array the loop variable
is passed to display_progress() as-is, but while C arrays (and thus
the loop variable) start at 0 and end at N-1, the progress counter
must end at N. Fix this by passing 'i + 1' to display_progress(), like
most other callsites do.

There's an RFC series to add a GIT_TEST_CHECK_PROGRESS=1 mode[1] which
catches this issue in the 'fetch.writeCommitGraph' and
'fetch.writeCommitGraph with submodules' tests in
't5510-fetch.sh'. The GIT_TEST_CHECK_PROGRESS=1 mode is not part of
this series, but future changes to progress.c may add it or similar
assertions to catch this and similar bugs elsewhere.

1. https://lore.kernel.org/git/20210620200303.2328957-1-szeder.dev@gmail.com/

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09 09:58:19 -07:00
8463beaeb6 The fourth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-08 13:30:34 -07:00
cfba19618f Merge branch 'sg/column-nl'
The parser for the "--nl" option of "git column" has been
corrected.

* sg/column-nl:
  column: fix parsing of the '--nl' option
2021-09-08 13:30:34 -07:00
f7ab826740 Merge branch 'cb/makefile-apple-clang'
Build update for Apple clang.

* cb/makefile-apple-clang:
  build: catch clang that identifies itself as "$VENDOR clang"
  build: clang version may not be followed by extra words
  build: update detect-compiler for newer Xcode version
2021-09-08 13:30:33 -07:00
7ad8ddecfd Merge branch 'ps/ls-refs-strbuf-optim'
Micro-optimization for the wire protocol driver.

* ps/ls-refs-strbuf-optim:
  ls-refs: reuse buffer when sending refs
2021-09-08 13:30:33 -07:00
ec8d24f05d Merge branch 'rs/branch-allow-deleting-dangling'
"git branch -D <branch>" used to refuse to remove a broken branch
ref that points at a missing commit, which has been corrected.

* rs/branch-allow-deleting-dangling:
  branch: allow deleting dangling branches with --force
2021-09-08 13:30:32 -07:00
f0d795428e Merge branch 'mt/quiet-with-delayed-checkout'
The delayed checkout code path in "git checkout" etc. were chatty
even when --quiet and/or --no-progress options were given.

* mt/quiet-with-delayed-checkout:
  checkout: make delayed checkout respect --quiet and --no-progress
2021-09-08 13:30:32 -07:00
7b06222619 Merge branch 'rs/xopen-reports-open-failures'
Error diagnostics improvement.

* rs/xopen-reports-open-failures:
  use xopen() to handle fatal open(2) failures
  xopen: explicitly report creation failures
2021-09-08 13:30:32 -07:00
c8f491668e Merge branch 'dd/diff-files-unmerged-fix'
"git diff --relative" segfaulted and/or produced incorrect result
when there are unmerged paths.

* dd/diff-files-unmerged-fix:
  diff-lib: ignore paths that are outside $cwd if --relative asked
2021-09-08 13:30:31 -07:00
85246a7054 Merge branch 'dd/t6300-wo-gpg-fix'
Test fix.

* dd/t6300-wo-gpg-fix:
  t6300: check for cat-file exit status code
  t6300: don't run cat-file on non-existent object
2021-09-08 13:30:31 -07:00
efae5c2e22 Merge branch 'mh/credential-leakfix'
Leak fix.

* mh/credential-leakfix:
  credential: fix leak in credential_apply_config()
2021-09-08 13:30:30 -07:00
a20a40e3b6 Merge branch 'jk/t5323-no-pack-test-fix'
Test fix.

* jk/t5323-no-pack-test-fix:
  t5323: drop mentions of "master"
2021-09-08 13:30:30 -07:00
4293c057dc Merge branch 'js/maintenance-launchctl-fix'
"git maintenance" scheduler fix for macOS.

* js/maintenance-launchctl-fix:
  maintenance: skip bootout/bootstrap when plist is registered
  maintenance: create `launchctl` configuration using a lock file
2021-09-08 13:30:29 -07:00
31e4a0db03 Merge branch 'ab/rebase-fatal-fatal-fix'
Error message fix.

* ab/rebase-fatal-fatal-fix:
  rebase: emit one "fatal" in "fatal: fatal: <error>"
2021-09-08 13:30:29 -07:00
63ddde68cd Merge branch 'ab/ls-remote-packet-trace'
Debugging aid fix.

* ab/ls-remote-packet-trace:
  ls-remote: set packet_trace_identity(<name>)
2021-09-08 13:30:28 -07:00
ce7ae09bd4 Merge branch 'rs/git-mmap-uses-malloc'
mmap() imitation used to call xmalloc() that dies upon malloc()
failure, which has been corrected to just return an error to the
caller to be handled.

* rs/git-mmap-uses-malloc:
  compat: let git_mmap use malloc(3) directly
2021-09-08 13:30:27 -07:00
e18f4de927 Merge branch 'ga/send-email-sendmail-cmd'
Test fix.

* ga/send-email-sendmail-cmd:
  t9001: PATH must not use Windows-style paths
2021-09-08 13:30:27 -07:00
03137a4804 Merge branch 'me/t5582-cleanup'
Test fix.

* me/t5582-cleanup:
  t5582: remove spurious 'cd "$D"' line
2021-09-08 13:30:26 -07:00
7e44ff7a39 pull: release packs before fetching
On Windows, files cannot be removed nor renamed if there are still
handles held by a process. To remedy that, we try to release all open
handles to any `.pack` file before e.g. repacking (which would want to
remove the original `.pack` file(s) after it is done).

Since the `read_cache_unmerged()` and/or the `get_oid()` call in `git
pull` can cause `.pack` files to be opened, we need to release the open
handles before calling `git fetch`: the latter process might want to
spawn an auto-gc, which in turn might want to repack the objects.

This commit is similar in spirit to 5bdece0d70 (gc/repack: release
packs when needed, 2018-12-15).

This fixes https://github.com/git-for-windows/git/issues/3336.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-08 12:17:15 -07:00
957ba814bf commit-graph: when closing the graph, also release the slab
The slab has information about the commit graph. That means that it is
meaningless (and even misleading) when the commit graph was closed.

This seems not to matter currently, but we're about to fix a
Windows-specific bug where `git pull` does not close the object store
before fetching (risking that an implicit auto-gc fails to remove the
now-obsolete pack file(s)), and once we have that bug fix in place, it
does matter: after that bug fix, we will open the object store, do some
stuff with it, then close it, fetch, and then open it again, and do more
stuff. If we close the commit graph without releasing the corresponding
slab, we're hit by a symptom like this in t5520.19:

	BUG: commit-reach.c:85: bad generation skip 9223372036854775807
	> 3 at 5cd378271655d43a3b4477520014f02213ad1546

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-08 12:17:14 -07:00
18a2f66d8a t7814: show lack of alternate ODB-adding
The previous patches have made "git grep" no longer need to add
submodule ODBs as alternates, at least for the code paths tested in
t7814. Demonstrate this by making adding a submodule ODB as an alternate
fatal in this test.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Reviewed-by: Emily Shaffer <emilyshaffer@google.com>
Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-08 11:48:13 -07:00
e3e8bf046e submodule-config: pass repo upon blob config read
When reading the config of a submodule, if reading from a blob, read
using an explicitly specified repository instead of by adding the
submodule's ODB as an alternate and then reading an object from
the_repository.

This makes the "grep --recurse-submodules with submodules without
.gitmodules in the working tree" test in t7814 work when
GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB is true.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-08 11:48:09 -07:00
0693806bf8 grep: add repository to OID grep sources
Record the repository whenever an OID grep source is created, and teach
the worker threads to explicitly provide the repository when accessing
objects.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-08 11:48:05 -07:00
dd45471a37 grep: allocate subrepos on heap
Currently, struct repository objects corresponding to submodules are
allocated on the stack in grep_submodule(). This currently works because
they will not be used once grep_submodule() exits, but a subsequent
patch will require these structs to be accessible for longer (perhaps
even in another thread). Allocate them on the heap and clear them only
at the very end.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-08 11:48:02 -07:00
78ca584f1c grep: read submodule entry with explicit repo
Replace an existing parse_object_or_die() call (which implicitly works
on the_repository) with a function call that allows a repository to be
passed in. There is no such direct equivalent to parse_object_or_die(),
but we only need the type of the object, so replace with
oid_object_info().

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Reviewed-by: Emily Shaffer <emilyshaffer@google.com>
Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-08 11:47:59 -07:00
50d92b5f03 grep: typesafe versions of grep_source_init
grep_source_init() can create "struct grep_source" objects and,
depending on the value of the type passed, some void-pointer parameters have
different meanings. Because one of these types (GREP_SOURCE_OID) will
require an additional parameter in a subsequent patch, take the
opportunity to increase clarity and type safety by replacing this
function with individual functions for each type.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-08 11:47:55 -07:00
8d33c3af0b grep: use submodule-ODB-as-alternate lazy-addition
In the parent commit, Git was taught to add submodule ODBs as alternates
lazily, but grep does not use this because it computes the path to add
directly, not going through add_submodule_odb(). Add an equivalent to
add_submodule_odb() that takes the exact ODB path and teach grep to use
it.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Reviewed-by: Emily Shaffer <emilyshaffer@google.com>
Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-08 11:47:49 -07:00
a35e03dee0 submodule: lazily add submodule ODBs as alternates
Teach Git to add submodule ODBs as alternates to the object store of
the_repository only upon the first access of an object not in
the_repository, and not when add_submodule_odb() is called.

This provides a means of gradually migrating from accessing a
submodule's object through alternates to accessing a submodule's object
by explicitly passing its repository object. Any Git command can declare
that it might access submodule objects by calling add_submodule_odb()
(as they do now), but the submodule ODBs themselves will not be added
until needed, so individual commands and/or combinations of arguments
can be migrated one by one.

[The advantage of explicit repository-object passing is code clarity (it
is clear which repository an object read is from), performance (there is
no need to linearly search through all submodule ODBs whenever an object
is accessed from any repository, whether superproject or submodule), and
the possibility of future features like partial clone submodules (which
right now is not possible because if an object is missing, we do not
know which repository to lazy-fetch into).]

This commit also introduces an environment variable that a test may set
to make the actual registration of alternates fatal, in order to
demonstrate that its codepaths do not need this registration.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Reviewed-by: Emily Shaffer <emilyshaffer@google.com>
Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-08 11:47:36 -07:00
e8ffd034c8 read-cache: fix GIT_TEST_SPLIT_INDEX
Running tests with GIT_TEST_SPLIT_INDEX=1 is supposed to turn on the
split index feature and trigger index splitting (mostly) randomly.
Alas, this has been broken since 6e37c8ed3c (read-cache.c: fix writing
"link" index ext with null base oid, 2019-02-13), and
GIT_TEST_SPLIT_INDEX=1 hasn't triggered any index splitting since
then.

This patch makes GIT_TEST_SPLIT_INDEX work again, though it doesn't
restore the pre-6e37c8ed3c behavior.  To understand the bug, the fix,
and the behavior change we first have to look at how
GIT_TEST_SPLIT_INDEX used to work before 6e37c8ed3c:

  There are two places where we check the value of
  GIT_TEST_SPLIT_INDEX, and before 6e37c8ed3c they worked like this:

    1) In the lower-level do_write_index(), where, if
       GIT_TEST_SPLIT_INDEX is enabled, we call init_split_index().
       This call merely allocates and zero-initializes
       'istate->split_index', but does nothing else (i.e. doesn't fill
       the base/shared index with cache entries, doesn't actually
       write a shared index file, etc.).  Pertinent to this issue, the
       hash of the base index remains all zeroed out.

    2) In the higher-level write_locked_index(), but only when
       'istate->split_index' has already been initialized.  Then, if
       GIT_TEST_SPLIT_INDEX is enabled, it randomly sets the flag that
       triggers index splitting later in this function.  This
       randomness comes from the first byte of the hash of the base
       index via an 'if ((first_byte & 15) < 6)' condition.

       However, if 'istate->split_index' hasn't been initialized (i.e.
       it is still NULL), then write_locked_index() just calls
       do_write_locked_index(), which internally calls the above
       mentioned do_write_index().

  This means that while GIT_TEST_SPLIT_INDEX=1 usually triggered index
  splitting randomly, the first two index writes were always
  deterministic (though I suspect this was unintentional):

    - The initial index write never splits the index.
      During the first index write write_locked_index() is called with
      'istate->split_index' still uninitialized, so the check in 2) is
      not executed.  It still calls do_write_index(), though, which
      then executes the check in 1).  The resulting all zero base
      index hash then leads to the 'link' extension being written to
      '.git/index', though a shared index file is not written:

        $ rm .git/index
        $ GIT_TEST_SPLIT_INDEX=1 git update-index --add file
        $ test-tool dump-split-index .git/index
        own c6ef71168597caec8553c83d9d0048f1ef416170
        base 0000000000000000000000000000000000000000
        100644 d00491fd7e5bb6fa28c517a0bb32b8b506539d4d 0 file
        replacements:
        deletions:
        $ ls -l .git/sharedindex.*
        ls: cannot access '.git/sharedindex.*': No such file or directory

    - The second index write always splits the index.
      When the index written in the previous point is read,
      'istate->split_index' is initialized because of the presence of
      the 'link' extension.  So during the second write
      write_locked_index() does run the check in 2), and the first
      byte of the all zero base index hash always fulfills the
      randomness condition, which in turn always triggers the index
      splitting.

    - Subsequent index writes will find the 'link' extension with a
      real non-zero base index hash, so from then on the check in 2)
      is executed and the first byte of the base index hash is as
      random as it gets (coming from the SHA-1 of index data including
      timestamps and inodes...).

All this worked until 6e37c8ed3c came along, and stopped writing the
'link' extension if the hash of the base index was all zero:

  $ rm .git/index
  $ GIT_TEST_SPLIT_INDEX=1 git update-index --add file
  $ test-tool dump-split-index .git/index
  own abbd6f6458d5dee73ae8e210ca15a68a390c6fd7
  not a split index
  $ ls -l .git/sharedindex.*
  ls: cannot access '.git/sharedindex.*': No such file or directory

So, since the first index write with GIT_TEST_SPLIT_INDEX=1 doesn't
write a 'link' extension, in the second index write
'istate->split_index' remains uninitialized, and the check in 2) is
not executed, and ultimately the index is never split.

Fix this by modifying write_locked_index() to make sure to check
GIT_TEST_SPLIT_INDEX even if 'istate->split_index' is still
uninitialized, and initialize it if necessary.  The check for
GIT_TEST_SPLIT_INDEX and separate init_split_index() call in
do_write_index() thus becomes unnecessary, so remove it.  Furthermore,
add a test to 't1700-split-index.sh' to make sure that
GIT_TEST_SPLIT_INDEX=1 will keep working (though only check the
index splitting on the first index write, because after that it will
be random).

Note that this change does not restore the pre-6e37c8ed3c behaviour,
as it will deterministically split the index already on the first
index write.  Since GIT_TEST_SPLIT_INDEX is purely a developer aid,
there is no backwards compatibility issue here.  The new behaviour did
trigger test failures in 't0003-attributes.sh' and 't1600-index.sh',
though, which have been fixed in preparatory patches in this series.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 23:28:04 -07:00
61feddcdf2 tests: disable GIT_TEST_SPLIT_INDEX for sparse index tests
The sparse index and split index features are said to be currently
incompatible [1], and consequently GIT_TEST_SPLIT_INDEX=1 might
interfere with the test cases exercising the sparse index feature.
Therefore GIT_TEST_SPLIT_INDEX is already explicitly disabled for the
whole of 't1092-sparse-checkout-compatibility.sh'.  There are,
however, two other test cases exercising sparse index, namely
'sparse-index enabled and disabled' in
't1091-sparse-checkout-builtin.sh' and 'status succeeds with sparse
index' in 't7519-status-fsmonitor.sh', and these two could fail with
GIT_TEST_SPLIT_INDEX=1 as well [2].

Unset GIT_TEST_SPLIT_INDEX and disable the split index in these two
test cases to avoid such interference.

Note that this is the minimal change to merely avoid failures when
these test cases are run with GIT_TEST_SPLIT_INDEX=1.  Interestingly,
though, without these changes the 'git sparse-checkout init --cone
--sparse-index' commands still succeed even with split index, and set
all the necessary configuration variables and create the initial
'$GIT_DIR/info/sparse-checkout' file, but the test failures are caused
by later sanity checks finding that the index is not in fact a sparse
index.  This indicates that 'git sparse-checkout init --sparse-index'
lacks some error checking and its tests lack coverage related to split
index, but fixing those issues is beyond the scope of this patch
series.

[1] https://public-inbox.org/git/48e9c3d6-407a-1843-2d91-22112410e3f8@gmail.com/

[2] Neither of these test cases fail at the moment, because
    GIT_TEST_SPLIT_INDEX=1 is broken and never splits the index, and
    it broke long before the sparse index feature was added.
    This patch series is about to fix GIT_TEST_SPLIT_INDEX, and then
    both test cases mentioned above would fail.

(The diff is best viewed with '--ignore-all-space')

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 23:28:02 -07:00
998330ac2e read-cache: look for shared index files next to the index, too
When reading a split index git always looks for its referenced shared
base index in the gitdir of the current repository, even when reading
an alternate index specified via GIT_INDEX_FILE, and even when that
alternate index file is the "main" '.git/index' file of an other
repository.  However, if that split index and its referenced shared
index files were written by a git command running entirely in that
other repository, then, naturally, the shared index file is written to
that other repository's gitdir.  Consequently, a git command
attempting to read that shared index file while running in a different
repository won't be able find it and will error out.

I'm not sure in what use case it is necessary to read the index of one
repository by a git command running in a different repository, but it
is certainly possible to do so, and in fact the test 'bare repository:
check that --cached honors index' in 't0003-attributes.sh' does
exactly that.  If GIT_TEST_SPLIT_INDEX=1 were to split the index in
just the right moment [1], then this test would indeed fail, because
the referenced shared index file could not be found.

Let's look for the referenced shared index file not only in the gitdir
of the current directory, but, if the shared index is not there, right
next to the split index as well.

[1] We haven't seen this issue trigger a failure in t0003 yet,
    because:

      - While GIT_TEST_SPLIT_INDEX=1 is supposed to trigger index
        splitting randomly, the first index write has always been
        deterministic and it has never split the index.

      - That alternate index file in the other repository is written
        only once in the entire test script, so it's never split.

    However, the next patch will fix GIT_TEST_SPLIT_INDEX, and while
    doing so it will slightly change its behavior to always split the
    index already on the first index write, and t0003 would always
    fail without this patch.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 23:23:47 -07:00
daad41c96d t1600-index: disable GIT_TEST_SPLIT_INDEX
Tests in 't1600-index.sh' check that various bogus index version
values are recognized and an appropriate warning message is issued.
GIT_TEST_SPLIT_INDEX=1 is supposed to trigger index splitting
randomly, and thus might interfere [1] with these tests: splitting the
index means that two index files are written (the shared base index
and the split '.git/index'), and the same warning message is then
issued twice, failing these tests.

Unset GIT_TEST_SPLIT_INDEX in this test script to avoid such
interference.

[1] There is no such interference at the moment, because, alas,
    GIT_TEST_SPLIT_INDEX=1 is broken and never split the index.  There
    was no such interference in the past (before it broke) either,
    because the first index write with GIT_TEST_SPLIT_INDEX=1 never
    split the index, only the second write did.  A subsequent commit
    fixing GIT_TEST_SPLIT_INDEX will have all the details on this.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 23:23:47 -07:00
bdfe1f0d69 t1600-index: don't run git commands upstream of a pipe
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 23:23:47 -07:00
c344256811 t1600-index: remove unnecessary redirection
In a helper function in the 't1600-index.sh' test script the stderr
of a 'git add' command is redirected to its stdout, but its stdout is
not redirected anywhere.  So apparently this redirection is
unnecessary, remove it.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 23:23:47 -07:00
22d18a9b15 Merge branch 'ds/sparse-index-ignored-files' into sg/test-split-index-fix
* ds/sparse-index-ignored-files:
  sparse-checkout: clear tracked sparse dirs
  sparse-index: add SPARSE_INDEX_MEMORY_ONLY flag
  attr: be careful about sparse directories
  sparse-checkout: create helper methods
  sparse-index: use WRITE_TREE_MISSING_OK
  sparse-index: silently return when cache tree fails
  unpack-trees: fix nested sparse-dir search
  sparse-index: silently return when not using cone-mode patterns
  t7519: rewrite sparse index test
2021-09-07 23:23:22 -07:00
55dfcf9591 sparse-checkout: clear tracked sparse dirs
When changing the scope of a sparse-checkout using cone mode, we might
have some tracked directories go out of scope. The current logic removes
the tracked files from within those directories, but leaves the ignored
files within those directories. This is a bit unexpected to users who
have given input to Git saying they don't need those directories
anymore.

This is something that is new to the cone mode pattern type: the user
has explicitly said "I want these directories and _not_ those
directories." The typical sparse-checkout patterns more generally apply
to "I want files with with these patterns" so it is natural to leave
ignored files as they are. This focus on directories in cone mode
provides us an opportunity to change the behavior.

Leaving these ignored files in the sparse directories makes it
impossible to gain performance benefits in the sparse index. When we
track into these directories, we need to know if the files are ignored
or not, which might depend on the _tracked_ .gitignore file(s) within
the sparse directory. This depends on the indexed version of the file,
so the sparse directory must be expanded.

We must take special care to look for untracked, non-ignored files in
these directories before deleting them. We do not want to delete any
meaningful work that the users were doing in those directories and
perhaps forgot to add and commit before switching sparse-checkout
definitions. Since those untracked files might be code files that
generated ignored build output, also do not delete any ignored files
from these directories in that case. The users can recover their state
by resetting their sparse-checkout definition to include that directory
and continue. Alternatively, they can see the warning that is presented
and delete the directory themselves to regain the performance they
expect.

By deleting the sparse directories when changing scope (or running 'git
sparse-checkout reapply') we regain these performance benefits as if the
repository was in a clean state.

Since these ignored files are frequently build output or helper files
from IDEs, the users should not need the files now that the tracked
files are removed. If the tracked files reappear, then they will have
newer timestamps than the build artifacts, so the artifacts will need to
be regenerated anyway.

Use the sparse-index as a data structure in order to find the sparse
directories that can be safely deleted. Re-expand the index to a full
one if it was full before.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 22:41:10 -07:00
ce7a9f0141 sparse-index: add SPARSE_INDEX_MEMORY_ONLY flag
The convert_to_sparse() method checks for the GIT_TEST_SPARSE_INDEX
environment variable or the "index.sparse" config setting before
converting the index to a sparse one. This is for ease of use since all
current consumers are preparing to compress the index before writing it
to disk. If these settings are not enabled, then convert_to_sparse()
silently returns without doing anything.

We will add a consumer in the next change that wants to use the sparse
index as an in-memory data structure, regardless of whether the on-disk
format should be sparse.

To that end, create the SPARSE_INDEX_MEMORY_ONLY flag that will skip
these config checks when enabled. All current consumers are modified to
pass '0' in the new 'flags' parameter.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 22:41:10 -07:00
77efbb366a attr: be careful about sparse directories
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 22:41:10 -07:00
02155c8c00 sparse-checkout: create helper methods
As we integrate the sparse index into more builtins, we occasionally
need to check the sparse-checkout patterns to see if a path is within
the sparse-checkout cone. Create some helper methods that help
initialize the patterns and check for pattern matching to make this
easier.

The existing callers of commands like get_sparse_checkout_patterns() use
a custom 'struct pattern_list' that is not necessarily the one in the
'struct index_state', so there are not many previous uses that could
adopt these helpers. There are just two in builtin/add.c and
sparse-index.c that can use path_in_sparse_checkout().

We add a path_in_cone_mode_sparse_checkout() as well that will only
return false if the path is outside of the sparse-checkout definition
_and_ the sparse-checkout patterns are in cone mode.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 22:41:10 -07:00
8a96b9d0a7 sparse-index: use WRITE_TREE_MISSING_OK
When updating the cache tree in convert_to_sparse(), the
WRITE_TREE_MISSING_OK flag indicates that trees might be computed that
do not already exist within the object database. This happens in cases
such as 'git add' creating new trees that it wants to store in
anticipation of a following 'git commit'. If this flag is not specified,
then it might trigger a promisor fetch or a failure due to the object
not existing locally.

Use WRITE_TREE_MISSING_OK during convert_to_sparse() to avoid these
possible reasons for the cache_tree_update() to fail.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 22:41:09 -07:00
5dc16756b2 sparse-index: silently return when cache tree fails
If cache_tree_update() returns a non-zero value, then it could not
create the cache tree. This is likely due to a path having a merge
conflict. Since we are already returning early, let's return silently to
avoid making it seem like we failed to write the index at all.

If we remove our dependence on the cache tree within
convert_to_sparse(), then we could still recover from this scenario and
have a sparse index.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 22:41:09 -07:00
72d84ea347 unpack-trees: fix nested sparse-dir search
The iterated search in find_cache_entry() was recently modified to
include a loop that searches backwards for a sparse directory entry that
matches the given traverse_info and name_entry. However, the string
comparison failed to actually concatenate those two strings, so this
failed to find a sparse directory when it was not a top-level directory.

This caused some errors in rare cases where a 'git checkout' spanned a
diff that modified files within the sparse directory entry, but we could
not correctly find the entry.

Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Helped-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 22:41:09 -07:00
e27eab45c7 sparse-index: silently return when not using cone-mode patterns
While the sparse-index is only enabled when core.sparseCheckoutCone is
also enabled, it is possible for the user to modify the sparse-checkout
file manually in a way that does not match cone-mode patterns. In this
case, we should refuse to convert an index into a sparse index, since
the sparse_checkout_patterns will not be initialized with recursive and
parent path hashsets.

Also silently return if there are no cache entries, which is a simple
case: there are no paths to make sparse!

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 22:41:09 -07:00
522d3cec00 t7519: rewrite sparse index test
The sparse index is tested with the FS Monitor hook and extension since
f8fe49e (fsmonitor: integrate with sparse index, 2021-07-14). This test
was very fragile because it shared an index across sparse and non-sparse
behavior. Since that expansion and contraction could cause the index to
lose its FS Monitor bitmap and token, behavior is fragile to changes in
'git sparse-checkout set'.

Rewrite the test to use two clones of the original repo: full and
sparse. This allows us to also keep the test files (actual, expect,
trace2.txt) out of the repos we are testing with 'git status'.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 22:41:09 -07:00
8fe8bae9d2 pack-write: skip *.rev work when not writing *.rev
Fix a performance regression introduced in a587b5a786 (pack-write.c:
extract 'write_rev_file_order', 2021-03-30) and stop needlessly
allocating the "pack_order" array and sorting it with
"pack_order_cmp()", only to throw that work away when we discover that
we're not writing *.rev files after all.

This redundant work was not present in the original version of this
code added in 8ef50d9958 (pack-write.c: prepare to write 'pack-*.rev'
files, 2021-01-25). There we'd call write_rev_file() from
e.g. finish_tmp_packfile(), but we'd "return NULL" early in
write_rev_file() if not doing a "WRITE_REV" or "WRITE_REV_VERIFY".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 22:04:03 -07:00
17919c3585 sequencer: restrict scope of a formerly public function
The function to add the `exec` commands to the todo list only needed to
be public API because it was not only used internally by the sequencer,
but also by `git rebase --preserve-merges`.

Now that that mode has been removed, we no longer need that function to
be scoped publicly.

Helped-by: Alban Gruin <alban.gruin@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 21:45:33 -07:00
06aa5e4ea4 rebase: remove a no-longer-used function
With the `--preserve-merges` option going away, we no longer need this
function.

Helped-by: Alban Gruin <alban.gruin@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 21:45:33 -07:00
82db1f8439 rebase: stop mentioning the -p option in comments
We no longer support `--preserve-merges`, therefore it does not make
sense to keep mentioning that option, even in code comments.

Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 21:45:33 -07:00
ff8d6e5a66 rebase: remove obsolete code comment
Now that we no longer have a `--preserve-merges` backend, this comment
needs to be adjusted.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 21:45:33 -07:00
5b55b32bd2 rebase: drop the internal rebase--interactive command
It was only used by the `--preserve-merges` backend, which we just
removed.

Helped-by: Alban Gruin <alban.gruin@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 21:45:33 -07:00
0a159d65d6 git-svn: drop support for --preserve-merges
We already passed the `--rebase-merges` option to `git rebase` instead,
now we make this move permanent.

As pointed out by Ævar Arnfjörð Bjarmason, in contrast to the
deprecation of `git rebase`'s `--preserve-merges` backend, `git svn`
only deprecated this option in v2.25.0 (because this developer missed
`git svn`'s use of that backend when deprecating the rebase backend
running up to Git v2.22).

Still, v2.25.0 has been released on January 13th, 2020. In other words,
`git svn` deprecated this option over one and a half years ago, _and_
has been redirecting to the `--rebase-merges` option during all that
time (read: `git svn rebase --preserve-merges` didn't do _precisely_
what the user asked, since v2.25.0, anyway, it fell back to pretending
that the user asked for `git svn rebase --rebase-merges` instead).

It is time to act on that deprecation and remove that option after all.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 21:45:33 -07:00
a74b35081c rebase: drop support for --preserve-merges
This option was deprecated in favor of `--rebase-merges` some time ago,
and now we retire it.

To assist users to transition away, we do not _actually_ remove the
option, but now we no longer implement the functionality. Instead, we
offer a helpful error message suggesting which option to use.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 21:45:33 -07:00
52f1e82178 pull: remove support for --rebase=preserve
In preparation for `git-rebase--preserve-merges.sh` entering its after
life, we remove this (deprecated) option that would still rely on it.

To help users transition who still did not receive the memo about the
deprecation, we offer a helpful error message instead of throwing our
hands in the air and saying that we don't know that option, never heard
of it.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 21:45:32 -07:00
aa4df107e7 tests: stop testing git rebase --preserve-merges
This backend has been deprecated in favor of `git rebase
--rebase-merges`.

In preparation for dropping it, let's remove all the regression tests
that would need it.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 21:45:32 -07:00
ab7c7c219b remote: warn about unhandled branch.<name>.rebase values
We ignore them silently, but it actually makes sense to warn the users
that their config setting has no effect.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 21:45:32 -07:00
6df8755c7b t5520: do not use pull.rebase=preserve
Even if those tests try to override that setting, let's not use it
because it is deprecated: let's use `merges` instead.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 21:45:32 -07:00
92a5d1c9b4 hash-object: prefix_filename() returns allocated memory these days
Back when a1be47e4 (hash-object: fix buffer reuse with --path in a
subdirectory, 2017-03-20) was written, the prefix_filename() helper
used a static piece of memory to the caller, making the caller
responsible for copying it, if it wants to keep it across another
call to the same function.  Two callers of the prefix_filename() in
hash-object were made to xstrdup() the value obtained from it.

But in the same series, when e4da43b1 (prefix_filename: return newly
allocated string, 2017-03-20) changed the rule to gave the caller
possession of the memory, we forgot to revert one of the xstrdup()
changes, allowing the returned value to leak.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 15:18:59 -07:00
ca0cc98e03 Documentation: fix default directory of git bugreport -o
git bugreport writes bug report to the current directory by default,
instead of repository root.

Fix the documentation.

Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 14:25:13 -07:00
72b113e562 Makefile: remove the check_bindir script
This script was added in f28ac70f48 (Move all dashed-form commands to
libexecdir, 2007-11-28) when commands such as "git-add" lived in the
bin directory, instead of the git exec directory.

This notice helped someone incorrectly installing version v1.6.0 and
later into a directory built for a pre-v1.6.0 git version.

We're now long past the point where anyone who'd be helped by this
warning is likely to be doing that, so let's just remove this check
and warning to simplify the Makefile.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 13:27:40 -07:00
b996f84989 send-email: fix a "first config key wins" regression in v2.33.0
Fix a regression in my c95e3a3f0b (send-email: move trivial config
handling to Perl, 2021-05-28) where we'd pick the first config key out
of multiple defined ones, instead of using the normal "last key wins"
semantics of "git config --get".

This broke e.g. cases where a .git/config would have a different
sendemail.smtpServer than ~/.gitconfig. We'd pick the ~/.gitconfig
over .git/config, instead of preferring the repository-local
version. The same would go for /etc/gitconfig etc.

The full list of impacted config keys (the %config_settings values
which are references to scalars, not arrays) is:

    sendemail.smtpencryption
    sendemail.smtpserver
    sendemail.smtpserverport
    sendemail.smtpuser
    sendemail.smtppass
    sendemail.smtpdomain
    sendemail.smtpauth
    sendemail.smtpbatchsize
    sendemail.smtprelogindelay
    sendemail.tocmd
    sendemail.cccmd
    sendemail.aliasfiletype
    sendemail.envelopesender
    sendemail.confirm
    sendemail.from
    sendemail.assume8bitencoding
    sendemail.composeencoding
    sendemail.transferencoding
    sendemail.sendmailcmd

I.e. having any of these set in say ~/.gitconfig and in-repo
.git/config regressed in v2.33.0 to prefer the --global one over the
--local.

To test this add a test of config priority to one of these config
variables, most don't have tests at all, but there was an existing one
for sendemail.8bitEncoding.

The "git config" (instead of "test_config") is somewhat of an
anti-pattern, but follows established conventions in
t9001-send-email.sh, likewise with any other pattern or idiom in this
test.

The populating of home/.gitconfig and setting of HOME= is copied from
a test in t0017-env-helper.sh added in 1ff750b128 (tests: make
GIT_TEST_GETTEXT_POISON a boolean, 2019-06-21). This test fails
without this bugfix, but now it works.

Reported-by: Eli Schwartz <eschwartz@archlinux.org>
Tested-by: Eli Schwartz <eschwartz@archlinux.org>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 13:18:55 -07:00
709b3f32d3 range-diff: avoid segfault with -I
output() reuses the same struct diff_options for multiple calls of
diff_flush().  Set the option no_free to instruct it to keep the
ignore regexes between calls and release them explicitly at the end.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 13:03:13 -07:00
5acffd3473 diff-index: restore -c/--cc options handling
This fixes 19b2517f (diff-merges: move specific diff-index "-m"
handling to diff-index, 2021-05-21).

That commit disabled handling of all diff for merges options in
diff-index on an assumption that they are unused. However, it later
appeared that -c and --cc, even though undocumented and not being
covered by tests, happen to have had particular effect on diff-index
output.

Restore original -c/--cc options handling by diff-index.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 11:11:35 -07:00
2d3491b117 tr2: log N parent process names on Linux
In 2f732bf15e (tr2: log parent process name, 2021-07-21) we started
logging parent process names, but only logged all parents on Windows.
on Linux only the name of the immediate parent process was logged.

Extend the functionality added there to also log full parent chain on
Linux.

This requires us to lookup "/proc/<getppid()>/stat" instead of
"/proc/<getppid()>/comm". The "comm" file just contains the name of the
process, but the "stat" file has both that information, and the parent
PID of that process, see procfs(5). We parse out the parent PID of our
own parent, and recursively walk the chain of "/proc/*/stat" files all
the way up the chain. A parent PID of 0 indicates the end of the
chain.

It's possible given the semantics of Linux's PID files that we end up
getting an entirely nonsensical chain of processes. It could happen if
e.g. we have a chain of processes like:

    1 (init) => 321 (bash) => 123 (git)

Let's assume that "bash" was started a while ago, and that as shown
the OS has already cycled back to using a lower PID for us than our
parent process. In the time it takes us to start up and get to
trace2_collect_process_info(TRACE2_PROCESS_INFO_STARTUP) our parent
process might exit, and be replaced by an entirely different process!

We'd racily look up our own getppid(), but in the meantime our parent
would exit, and Linux would have cycled all the way back to starting
an entirely unrelated process as PID 321.

If that happens we'll just silently log incorrect data in our ancestry
chain. Luckily we don't need to worry about this except in this
specific cycling scenario, as Linux does not have PID
randomization. It appears it once did through a third-party feature,
but that it was removed around 2006[1]. For anyone worried about this
edge case raising PID_MAX via "/proc/sys/kernel/pid_max" will mitigate
it, but not eliminate it.

One thing we don't need to worry about is getting into an infinite
loop when walking "/proc/*/stat". See 353d3d77f4 (trace2: collect
Windows-specific process information, 2019-02-22) for the related
Windows code that needs to deal with that, and [2] for an explanation
of that edge case.

Aside from potential race conditions it's also a bit painful to
correctly parse the process name out of "/proc/*/stat". A simpler
approach is to use fscanf(), see [3] for an implementation of that,
but as noted in the comment being added here it would fail in the face
of some weird process names, so we need our own parse_proc_stat() to
parse it out.

With this patch the "ancestry" chain for a trace2 event might look
like this:

    $ GIT_TRACE2_EVENT=/dev/stdout ~/g/git/git version | grep ancestry | jq -r .ancestry
    [
      "bash",
      "screen",
      "systemd"
    ]

And in the case of naughty process names like the following. This uses
perl's ability to use prctl(PR_SET_NAME, ...). See
Perl/perl5@7636ea95c5 (Set the legacy process name with prctl() on
assignment to $0 on Linux, 2010-04-15)[4]:

    $ perl -e '$0 = "(naughty\nname)"; system "GIT_TRACE2_EVENT=/dev/stdout ~/g/git/git version"' | grep ancestry | jq -r .ancestry
    [
      "sh",
      "(naughty\nname)",
      "bash",
      "screen",
      "systemd"
    ]

1. https://grsecurity.net/news#grsec2110
2. https://lore.kernel.org/git/48a62d5e-28e2-7103-a5bb-5db7e197a4b9@jeffhostetler.com/
3. https://lore.kernel.org/git/87o8agp29o.fsf@evledraar.gmail.com/
4. 7636ea95c5

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 11:08:00 -07:00
326460a870 tr2: do compiler enum check in trace2_collect_process_info()
Change code added in 2f732bf15e (tr2: log parent process name,
2021-07-21) to use a switch statement without a "default" branch to
have the compiler error if this code ever drifts out of sync with the
members of the "enum trace2_process_info_reason".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 11:07:59 -07:00
6eccfc3adf tr2: leave the parent list empty upon failure & don't leak memory
In a subsequent commit I'll be replacing most of this code to log N
parents, but let's first fix bugs introduced in the recent
2f732bf15e (tr2: log parent process name, 2021-07-21).

It was using the strbuf_read_file() in the wrong way, its return value
is either a length or a negative value on error. If we didn't have a
procfs, or otherwise couldn't access it we'd end up pushing an empty
string to the trace2 ancestry array.

It was also using the strvec_push() API the wrong way. That API always
does an xstrdup(), so by detaching the strbuf here we'd leak
memory. Let's instead pass in our pointer for strvec_push() to
xstrdup(), and then free our own strbuf. I do have some WIP changes to
make strvec_push_nodup() non-static, which makes this and some other
callsites nicer, but let's just follow the prevailing pattern of using
strvec_push() for now.

We'll also need to free that "procfs_path" strbuf whether or not
strbuf_read_file() succeeds, which was another source of memory leaks
in 2f732bf15e, i.e. we'd leak that memory as well if we weren't on a
system where we could read the file from procfs.

Let's move all the freeing of the memory to the end of the
function. If we're still at STRBUF_INIT with "name" due to not having
taken the branch where the strbuf_read_file() succeeds freeing it is
redundant. So we could move it into the body of the "if", but just
handling freeing the same way for all branches of the function makes
it more readable.

In combination with the preceding commit this makes all of
t[0-9]*trace2*.sh pass under SANITIZE=leak on Linux.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 11:07:59 -07:00
48f68715b1 tr2: stop leaking "thread_name" memory
Fix a memory leak introduced in ee4512ed48 (trace2: create new
combined trace facility, 2019-02-22), we were doing a free() of other
memory allocated in tr2tls_create_self(), but not the "thread_name"
"struct strbuf".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 11:07:59 -07:00
f2cc8881d7 tr2: clarify TRACE2_PROCESS_INFO_EXIT comment under Linux
Rewrite a comment added in 2f732bf15e (tr2: log parent process name,
2021-07-21) to describe what we might do under
TRACE2_PROCESS_INFO_EXIT in the future, instead of vaguely referring
to "something extra".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 11:07:59 -07:00
7d9c80f626 tr2: remove NEEDSWORK comment for "non-procfs" implementations
I'm fairly sure that there is no way on Linux to inspect the process
tree without using procfs, any tool such as ps(1), top(1) etc. that
shows this sort of information ultimately looks the information up in
procfs.

So let's remove this comment added in 2f732bf15e (tr2: log parent
process name, 2021-07-21), it's setting us up for an impossible task.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 11:07:59 -07:00
d941cc4c34 bundle: show progress on "unbundle"
The "unbundle" command added in 2e0afafebd (Add git-bundle: move
objects and references by archive, 2007-02-22) did not show progress
output, even though the underlying API learned how to show progress in
be042aff24 (Teach progress eye-candy to fetch_refs_from_bundle(),
2011-09-18).

Now we'll show "Unbundling objects" using the new --progress-title
option to "git index-pack", to go with its existing "Receiving
objects" and "Indexing objects" (which it shows when invoked with
"--stdin", and with a pack file, respectively).

Unlike "git bundle create" we don't handle "--quiet" here, nor
"--all-progress" and "--all-progress-implied". Those are all specific
to "create" (and "verify", in the case of "--quiet").

The structure of the existing documentation is a bit unclear, e.g. the
documentation for the "--quiet" option added in
79862b6b77 (bundle-create: progress output control, 2019-11-10) only
describes how it works for "create", and not for "verify". That and
other issues in it should be fixed, but I'd like to avoid untangling
that mess right now. Let's just support the standard "--no-progress"
implicitly here, and leave cleaning up the general behavior of "git
bundle" for a later change.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 10:59:23 -07:00
f46c46e4f2 index-pack: add --progress-title option
Add a --progress-title option to index-pack, when data is piped into
index-pack its progress is a proxy for whatever's feeding it data.

This option will allow us to set a more relevant progress bar title in
"git bundle unbundle", and is also used in my "bundle-uri" RFC
patches[1] by a new caller in fetch-pack.c.

The code change in cmd_index_pack() won't handle
"--progress-title=xyz", only "--progress-title xyz", and the "(i+1)"
style (as opposed to "i + 1") is a bit odd.

Not using the "--long-option=value" style is inconsistent with
existing long options handled by cmd_index_pack(), but makes the code
that needs to call it better (two strvec_push(), instead of needing a
strvec_pushf()). Since the option is internal-only the inconsistency
shouldn't matter.

I'm copying the pattern to handle it as-is from the handling of the
existing "-o" option in the same function, see 9cf6d3357a (Add
git-index-pack utility, 2005-10-12) for its addition. That's a short
option, but the code to implement the two is the same in functionality
and style. Eventually we'd like to migrate all of this this to
parse_options(), which would make these differences in behavior go
away.

1. https://lore.kernel.org/git/RFC-cover-00.13-0000000000-20210805T150534Z-avarab@gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 10:59:23 -07:00
7366096de9 bundle API: change "flags" to be "extra_index_pack_args"
Since the "flags" parameter was added in be042aff24 (Teach progress
eye-candy to fetch_refs_from_bundle(), 2011-09-18) there's never been
more than the one flag: BUNDLE_VERBOSE.

Let's have the only caller who cares about that pass "-v" itself
instead through new "extra_index_pack_args" parameter. The flexibility
of being able to pass arbitrary arguments to "unbundle" will be used
in a subsequent commit.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 10:59:23 -07:00
b681b191f9 maintenance: add support for systemd timers on Linux
The existing mechanism for scheduling background maintenance is done
through cron. On Linux systems managed by systemd, systemd provides an
alternative to schedule recurring tasks: systemd timers.

The main motivations to implement systemd timers in addition to cron
are:
* cron is optional and Linux systems running systemd might not have it
  installed.
* The execution of `crontab -l` can tell us if cron is installed but not
  if the daemon is actually running.
* With systemd, each service is run in its own cgroup and its logs are
  tagged by the service inside journald. With cron, all scheduled tasks
  are running in the cron daemon cgroup and all the logs of the
  user-scheduled tasks are pretended to belong to the system cron
  service.
  Concretely, a user that doesn’t have access to the system logs won’t
  have access to the log of their own tasks scheduled by cron whereas
  they will have access to the log of their own tasks scheduled by
  systemd timer.
  Although `cron` attempts to send email, that email may go unseen by
  the user because these days, local mailboxes are not heavily used
  anymore.

In order to schedule git maintenance, we need two unit template files:
* ~/.config/systemd/user/git-maintenance@.service
  to define the command to be started by systemd and
* ~/.config/systemd/user/git-maintenance@.timer
  to define the schedule at which the command should be run.

Those units are templates that are parameterized by the frequency.

Based on those templates, 3 timers are started:
* git-maintenance@hourly.timer
* git-maintenance@daily.timer
* git-maintenance@weekly.timer

The command launched by those three timers are the same as with the
other scheduling methods:

/path/to/git for-each-repo --exec-path=/path/to
--config=maintenance.repo maintenance run --schedule=%i

with the full path for git to ensure that the version of git launched
for the scheduled maintenance is the same as the one used to run
`maintenance start`.

The timer unit contains `Persistent=true` so that, if the computer is
powered down when a maintenance task should run, the task will be run
when the computer is back powered on.

Signed-off-by: Lénaïc Huard <lenaic@lhuard.fr>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 10:57:04 -07:00
eba1ba9d32 maintenance: git maintenance run learned --scheduler=<scheduler>
Depending on the system, different schedulers can be used to schedule
the hourly, daily and weekly executions of `git maintenance run`:
* `launchctl` for MacOS,
* `schtasks` for Windows and
* `crontab` for everything else.

`git maintenance run` now has an option to let the end-user explicitly
choose which scheduler he wants to use:
`--scheduler=auto|crontab|launchctl|schtasks`.

When `git maintenance start --scheduler=XXX` is run, it not only
registers `git maintenance run` tasks in the scheduler XXX, it also
removes the `git maintenance run` tasks from all the other schedulers to
ensure we cannot have two schedulers launching concurrent identical
tasks.

The default value is `auto` which chooses a suitable scheduler for the
system.

`git maintenance stop` doesn't have any `--scheduler` parameter because
this command will try to remove the `git maintenance run` tasks from all
the available schedulers.

Signed-off-by: Lénaïc Huard <lenaic@lhuard.fr>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 10:57:04 -07:00
cb7db5bbd5 cache.h: Introduce a generic "xdg_config_home_for(…)" function
Current implementation of `xdg_config_home(filename)` returns
`$XDG_CONFIG_HOME/git/$filename`, with the `git` subdirectory inserted
between the `XDG_CONFIG_HOME` environment variable and the parameter.

This patch introduces a `xdg_config_home_for(subdir, filename)` function
which is more generic. It only concatenates "$XDG_CONFIG_HOME", or
"$HOME/.config" if the former isn’t defined, with the parameters,
without adding `git` in between.

`xdg_config_home(filename)` is now implemented by calling
`xdg_config_home_for("git", filename)` but this new generic function can
be used to compute the configuration directory of other programs.

Signed-off-by: Lénaïc Huard <lenaic@lhuard.fr>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 10:57:04 -07:00
01c381037c test-lib-functions: keep user's debugger config files and TERM in 'debug'
The 'debug' function in test-lib-functions.sh is used to invoke a
debugger at a specific line in a test. It inherits the value of HOME and
TERM set by 'test-lib.sh': HOME="$TRASH_DIRECTORY" and TERM=dumb.

Changing the value of HOME means that any customization configured in a
developers' debugger configuration file (like $HOME/.gdbinit or
$HOME/.lldbinit) are not available in the debugger invoked by
'test_pause'.

Changing the value of TERM to 'dumb' means that colored output
is disabled in the debugger.

To make the debugging experience with 'debug' more pleasant, leverage
the variable USER_HOME, added in the previous commit, to copy a
developer's ~/.gdbinit and ~/.lldbinit to the test HOME. We do not set
HOME to USER_HOME as in 'test_pause' to avoid user configuration in
$USER_HOME/.gitconfig from interfering with the command being debugged.

Also, add a flag to launch the debugger with the original value of
TERM, and add the same warning as for 'test_pause'.

Helped-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 10:53:39 -07:00
add5240fa5 test-lib-functions: optionally keep HOME, TERM and SHELL in 'test_pause'
The 'test_pause' function, which is designed to help interactive
debugging and exploration of tests, currently inherits the value of HOME
and TERM set by 'test-lib.sh': HOME="$TRASH_DIRECTORY" and TERM=dumb. It
also invokes the shell defined by TEST_SHELL_PATH, which defaults to
/bin/sh (through SHELL_PATH).

Changing the value of HOME means that any customization configured in a
developers' shell startup files and any Git aliases defined in their
global Git configuration file are not available in the shell invoked by
'test_pause'.

Changing the value of TERM to 'dumb' means that colored output
is disabled for all commands in that shell.

Using /bin/sh as the shell invoked by 'test_pause' is not ideal since
some platforms (i.e. Debian and derivatives) use Dash as /bin/sh, and
this shell is usually compiled without readline support, which makes for
a poor interactive command line experience.

To make the interactive command line experience in the shell invoked by
'test_pause' more pleasant, save the values of HOME and TERM in
USER_HOME and USER_TERM before changing them in test-lib.sh, and add
options to 'test_pause' to optionally use these variables to invoke the
shell. Also add an option to invoke SHELL instead of TEST_SHELL_PATH, so
that developer's interactive shell is used.

We use options instead of changing the behaviour unconditionally since
these three variables can slightly change command behaviour. Moreover,
using the original HOME means commands could overwrite files in a user's
home directory. Be explicit about these caveats in the new 'Usage'
section in test-lib-functions.sh.

Finally, add '[options]' to the test_pause synopsys in t/README, and
mention that the full list of helper functions and their options can be
found in test-lib-functions.sh.

Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 10:53:39 -07:00
0aa496b6d5 test-lib-functions: use 'TEST_SHELL_PATH' in 'test_pause'
3f824e91c8 (t/Makefile: introduce TEST_SHELL_PATH, 2017-12-08)
made it easy to use a different shell for the tests than 'SHELL_PATH'
used at compile time. But 'test_pause' still invokes 'SHELL_PATH'.

If TEST_SHELL_PATH is set, invoke that shell in 'test_pause' for
consistency.

Suggested-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 10:53:39 -07:00
3231f41009 make: add INSTALL_STRIP option variable
Add $(INSTALL_STRIP), which allows passing stripping options to
$(INSTALL).

For this to work, installing executables must be split to installing
compiled binaries and scripts portions, since $(INSTALL_STRIP) is only
meaningful to the former.

Users can set this variable depending on their system. For example,
Linux users can use `-s --strip-program=strip`, while FreeBSD users can
simply set to `-s` and choose strip program with $STRIPBIN.

[original outline by Đoàn Trần Công Danh]

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-05 23:49:41 -07:00
57f183b698 apply: resolve trivial merge without hitting ll-merge with "--3way"
The ll_binary_merge() function assumes that the ancestor blob is
different from either side of the new versions, and always fails
the merge in conflict, unless -Xours or -Xtheirs is in effect.

The normal "merge" machineries all resolve the trivial cases
(e.g. if our side changed while their side did not, the result
is ours) without triggering the file-level merge drivers, so the
assumption is warranted.

The code path in "git apply --3way", however, does not check for
the trivial three-way merge situation and always calls the
file-level merge drivers.  This used to be perfectly OK back
when we always first attempted a straight patch application and
used the three-way code path only as a fallback.  Any binary
patch that can be applied as a trivial three-way merge (e.g. the
patch is based exactly on the version we happen to have) would
always cleanly apply, so the ll_binary_merge() that is not
prepared to see the trivial case would not have to handle such a
case.

This no longer is true after we made "--3way" to mean "first try
three-way and then fall back to straight application", and made
"git apply -3" on a binary patch that is based on the current
version no longer apply.

Teach "git apply -3" to first check for the trivial merge cases
and resolve them without hitting the file-level merge drivers.

Signed-off-by: Jerry Zhang <jerry@skydio.com>
[jc: stolen tests from Jerry's patch]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-05 15:39:02 -07:00
e0a2f5cbc5 The third batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-03 13:49:30 -07:00
d868e4d9ee Merge branch 'sg/make-fix-ar-invocation'
Build fix.

* sg/make-fix-ar-invocation:
  Makefile: remove archives before manipulating them with 'ar'
2021-09-03 13:49:30 -07:00
4c3bddb64f Merge branch 'ti/tcsh-completion-regression-fix'
Update to the command line completion (in contrib/) for tcsh.

* ti/tcsh-completion-regression-fix:
  completion: tcsh: Fix regression by drop of wrapper functions
2021-09-03 13:49:30 -07:00
77b063cd35 Merge branch 'fc/completion-updates'
Command line completion updates.

* fc/completion-updates:
  completion: bash: add correct suffix in variables
  completion: bash: fix for multiple dash commands
  completion: bash: fix for suboptions with value
  completion: bash: fix prefix detection in branch.*
2021-09-03 13:49:29 -07:00
9135259b03 Merge branch 'pw/rebase-r-fixes'
Various bugs in "git rebase -r" have been fixed.

* pw/rebase-r-fixes:
  rebase -r: fix merge -c with a merge strategy
  rebase -r: don't write .git/MERGE_MSG when fast-forwarding
  rebase -i: add another reword test
  rebase -r: make 'merge -c' behave like reword
2021-09-03 13:49:29 -07:00
0ba5a0b3ba Merge branch 'pw/rebase-skip-final-fix'
Checking out all the paths from HEAD during the last conflicted
step in "git rebase" and continuing would cause the step to be
skipped (which is expected), but leaves MERGE_MSG file behind in
$GIT_DIR and confuses the next "git commit", which has been
corrected.

* pw/rebase-skip-final-fix:
  rebase --continue: remove .git/MERGE_MSG
  rebase --apply: restore some tests
  t3403: fix commit authorship
2021-09-03 13:49:28 -07:00
2ad8d49635 Merge branch 'cb/ci-use-upload-artifacts-v1'
Use upload-artifacts v1 (instead of v2) for 32-bit linux, as the
new version has a blocker bug for that architecture.

* cb/ci-use-upload-artifacts-v1:
  ci: use upload-artifacts v1 for dockerized jobs
2021-09-03 13:49:28 -07:00
6e21f716f8 Merge branch 'jk/commit-edit-fixup-fix'
"git commit --fixup" now works with "--edit" again, after it was
broken in v2.32.

* jk/commit-edit-fixup-fix:
  commit: restore --edit when combined with --fixup
2021-09-03 13:49:27 -07:00
a5619d4f8d Merge branch 'ps/connectivity-optim'
The revision traversal API has been optimized by taking advantage
of the commit-graph, when available, to determine if a commit is
reachable from any of the existing refs.

* ps/connectivity-optim:
  revision: avoid hitting packfiles when commits are in commit-graph
  commit-graph: split out function to search commit position
  revision: stop retrieving reference twice
  connected: do not sort input revisions
  revision: separate walk and unsorted flags
2021-09-03 13:49:27 -07:00
6a8cbc41ba developer: enable pedantic by default
With the codebase firmly C99 compatible and most compilers supporting
newer versions by default, it could help bring visibility to problems.

Reverse the DEVOPTS=pedantic flag to provide a fallback for people stuck
with gcc < 5 or some other compiler that either doesn't support this flag
or has issues with it, and while at it also enable -Wpedantic which used
to be controversial[1] when Apple compilers and clang had widely divergent
version numbers.

Ideally any compiler found to have issues with these flags will be added
to an exception, and indeed, one was added to safely process windows
headers that would use non standard print identifiers, but it is expected
that more will be needed, so it could be considered a weather balloon.

[1] https://lore.kernel.org/git/20181127100557.53891-1-carenas@gmail.com/

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-03 11:40:30 -07:00
27e0c3c6cf win32: allow building with pedantic mode enabled
In preparation to building with pedantic mode enabled, change a couple
of places where the current mingw gcc compiler provided with the SDK
reports issues.

A full fix for the incompatible use of (void *) to store function
pointers has been punted, with the minimal change to instead use a
generic function pointer (FARPROC), and therefore the (hopefully)
temporary need to disable incompatible pointer warnings.

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-03 11:40:30 -07:00
153fb49e60 gettext: remove optional non-standard parens in N_() definition
Remove the USE_PARENS_AROUND_GETTEXT_N compile-time option which was
meant to catch an inadvertent mistake which is too obscure to
maintain this facility.

The backstory of how USE_PARENS_AROUND_GETTEXT_N came about is: When I
added the N_() macro in 6578483036 (i18n: add no-op _() and N_()
wrappers, 2011-02-22) it was defined as:

    #define N_(msgid) (msgid)

This is non-standard C, as was noticed and fixed in 642f85faab (i18n:
avoid parenthesized string as array initializer, 2011-04-07).
I.e. this needed to be defined as:

    #define N_(msgid) msgid

Then in e62cd35a3e (i18n: log: mark parseopt strings for translation,
2012-08-20) when "builtin_log_usage" was marked for translation the
string concatenation for passing to usage() added in 1c370ea4e5
(Show usage string for 'git log -h', 'git show -h' and 'git diff -h',
2009-08-06) was faithfully preserved:

-       "git log [<options>] [<since>..<until>] [[--] <path>...]\n"
-       "   or: git show [options] <object>...",
+       N_("git log [<options>] [<since>..<until>] [[--] <path>...]\n")
+       N_("   or: git show [options] <object>..."),

This was then fixed to be the expected array of usage strings in
e66dc0cc4b (log.c: fix translation markings, 2015-01-06) rather than
a string with multiple "\n"-delimited usage strings, and finally in
290c8e7a3f (gettext.h: add parentheses around N_ expansion if
supported, 2015-01-11) USE_PARENS_AROUND_GETTEXT_N was added to ensure
this mistake didn't happen again.

I think that even if this was a N_()-specific issue this
USE_PARENS_AROUND_GETTEXT_N facility wouldn't be worth it, the issue
would be too rare to worry about.

But I also think that 290c8e7a3f which introduced
USE_PARENS_AROUND_GETTEXT_N misattributed the problem. The issue
wasn't with the N_() macro added in e62cd35a3e, but that before the
N_() macro existed in the codebase the initial migration to
parse_options() in 1c370ea4e5 continued passsing in a "\n"-delimited
string, when the new API it was migrating to supported and expected
the passing of an array.

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-03 11:40:30 -07:00
efa3d64ce8 update-ref: fix streaming of status updates
When executing git-update-ref(1) with the `--stdin` flag, then the user
can queue updates and, since e48cf33b61 (update-ref: implement
interactive transaction handling, 2020-04-02), interactively drive the
transaction's state via a set of transactional verbs. This interactivity
is somewhat broken though: while the caller can use these verbs to drive
the transaction's state, the status messages which confirm that a verb
has been processed is not flushed. The caller may thus be left hanging
waiting for the acknowledgement.

Fix the bug by flushing stdout after writing the status update. Add a
test which catches this bug.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-03 11:35:15 -07:00
6540b71614 remote: avoid -Wunused-but-set-variable in gcc with -DNDEBUG
In make_remote(), we store the return value of hashmap_put() and check
it using assert(), but don't otherwise use it. If Git is compiled with
NDEBUG, then the assert() becomes a noop, and nobody looks at the
variable at all. This causes some compilers to produce warnings.

Let's switch it instead to a BUG(). This accomplishes the same thing,
but is always compiled in (and we don't have to worry about the cost;
the check is cheap, and this is not a hot code path).

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-02 13:13:19 -07:00
b45c172e51 gc: remove trailing dot from "gc.log" line
Remove the trailing dot from the warning we emit about gc.log. It's
common for various terminal UX's to allow the user to select "words",
and by including the trailing dot a user wanting to select the path to
gc.log will need to manually remove the trailing dot.

Such a user would also probably need to adjust the path if it e.g. had
spaces in it, but this should address this very common case.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Suggested-by: Jan Judas <snugar.i@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-02 11:22:32 -07:00
2d59597333 p5326: perf tests for MIDX bitmaps
These new performance tests demonstrate effectively the same behavior as
p5310, but use a multi-pack bitmap instead of a single-pack one.

Notably, p5326 does not create a MIDX bitmap with multiple packs. This
is so we can measure a direct comparison between it and p5310. Any
difference between the two is measuring just the overhead of using MIDX
bitmaps.

Here are the results of p5310 and p5326 together, measured at the same
time and on the same machine (using a Xenon W-2255 CPU):

    Test                                                  HEAD
    ------------------------------------------------------------------------
    5310.2: repack to disk                                96.78(93.39+11.33)
    5310.3: simulated clone                               9.98(9.79+0.19)
    5310.4: simulated fetch                               1.75(4.26+0.19)
    5310.5: pack to file (bitmap)                         28.20(27.87+8.70)
    5310.6: rev-list (commits)                            0.41(0.36+0.05)
    5310.7: rev-list (objects)                            1.61(1.54+0.07)
    5310.8: rev-list count with blob:none                 0.25(0.21+0.04)
    5310.9: rev-list count with blob:limit=1k             2.65(2.54+0.10)
    5310.10: rev-list count with tree:0                   0.23(0.19+0.04)
    5310.11: simulated partial clone                      4.34(4.21+0.12)
    5310.13: clone (partial bitmap)                       11.05(12.21+0.48)
    5310.14: pack to file (partial bitmap)                31.25(34.22+3.70)
    5310.15: rev-list with tree filter (partial bitmap)   0.26(0.22+0.04)

versus the same tests (this time using a multi-pack index):

    Test                                                  HEAD
    ------------------------------------------------------------------------
    5326.2: setup multi-pack index                        78.99(75.29+11.58)
    5326.3: simulated clone                               11.78(11.56+0.22)
    5326.4: simulated fetch                               1.70(4.49+0.13)
    5326.5: pack to file (bitmap)                         28.02(27.72+8.76)
    5326.6: rev-list (commits)                            0.42(0.36+0.06)
    5326.7: rev-list (objects)                            1.65(1.58+0.06)
    5326.8: rev-list count with blob:none                 0.26(0.21+0.05)
    5326.9: rev-list count with blob:limit=1k             2.97(2.86+0.10)
    5326.10: rev-list count with tree:0                   0.25(0.20+0.04)
    5326.11: simulated partial clone                      5.65(5.49+0.16)
    5326.13: clone (partial bitmap)                       12.22(13.43+0.38)
    5326.14: pack to file (partial bitmap)                30.05(31.57+7.25)
    5326.15: rev-list with tree filter (partial bitmap)   0.24(0.20+0.04)

There is slight overhead in "simulated clone", "simulated partial
clone", and "clone (partial bitmap)". Unsurprisingly, that overhead is
due to using the MIDX's reverse index to map between bit positions and
MIDX positions.

This can be reproduced by running "git repack -adb" along with "git
multi-pack-index write --bitmap" in a large-ish repository. Then run:

    $ perf record -o pack.perf git -c core.multiPackIndex=false \
      pack-objects --all --stdout >/dev/null </dev/null
    $ perf record -o midx.perf git -c core.multiPackIndex=true \
      pack-objects --all --stdout >/dev/null </dev/null

and compare the two with "perf diff -c delta -o 1 pack.perf midx.perf".
The most notable results are below (the next largest positive delta is
+0.14%):

    # Event 'cycles'
    #
    # Baseline    Delta  Shared Object       Symbol
    # ........  .......  ..................  ..........................
    #
                 +5.86%  git                 [.] nth_midxed_offset
                 +5.24%  git                 [.] nth_midxed_pack_int_id
         3.45%   +0.97%  git                 [.] offset_to_pack_pos
         3.30%   +0.57%  git                 [.] pack_pos_to_offset
                 +0.30%  git                 [.] pack_pos_to_midx

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-01 13:56:43 -07:00
9387fbd646 p5310: extract full and partial bitmap tests
A new p5326 introduced by the next patch will want these same tests,
interjecting its own setup in between. Move them out so that both perf
tests can reuse them.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-01 13:56:43 -07:00
ff1e653c8e midx: respect 'GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP'
Introduce a new 'GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP' environment
variable to also write a multi-pack bitmap when
'GIT_TEST_MULTI_PACK_INDEX' is set.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-01 13:56:43 -07:00
4b58b6f7b7 t7700: update to work with MIDX bitmap test knob
A number of these tests are focused only on pack-based bitmaps and need
to be updated to disable 'GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP' where
necessary.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-01 13:56:43 -07:00
e255a5e81c t5319: don't write MIDX bitmaps in t5319
This test is specifically about generating a midx still respecting a
pack-based bitmap file. Generating a MIDX bitmap would confuse the test.
Let's override the 'GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP' variable to
make sure we don't do so.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-01 13:56:43 -07:00
eb6e956e79 t5310: disable GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP
Generating a MIDX bitmap confuses many of the tests in t5310, which
expect to control whether and how bitmaps are written. Since the
relevant MIDX-bitmap tests here are covered already in t5326, let's just
disable the flag for the whole t5310 script.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-01 13:56:43 -07:00
d3f17e1723 t0410: disable GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP
Generating a MIDX bitmap causes tests which repack in a partial clone to
fail because they are missing objects. Missing objects is an expected
component of tests in t0410, so disable this knob altogether. Graceful
degradation when writing a bitmap with missing objects is tested in
t5326.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-01 13:56:43 -07:00
c51f5a6437 t5326: test multi-pack bitmap behavior
This patch introduces a new test, t5326, which tests the basic
functionality of multi-pack bitmaps.

Some trivial behavior is tested, such as:

  - Whether bitmaps can be generated with more than one pack.
  - Whether clones can be served with all objects in the bitmap.
  - Whether follow-up fetches can be served with some objects outside of
    the server's bitmap

These use lib-bitmap's tests (which in turn were pulled from t5310), and
we cover cases where the MIDX represents both a single pack and multiple
packs.

In addition, some non-trivial and MIDX-specific behavior is tested, too,
including:

  - Whether multi-pack bitmaps behave correctly with respect to the
    pack-reuse machinery when the base for some object is selected from
    a different pack than the delta.
  - Whether multi-pack bitmaps correctly respect the
    pack.preferBitmapTips configuration.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-01 13:56:43 -07:00
b1b82d1c30 t/helper/test-read-midx.c: add --checksum mode
Subsequent tests will want to check for the existence of a multi-pack
bitmap which matches the multi-pack-index stored in the pack directory.

The multi-pack bitmap includes the hex checksum of the MIDX it
corresponds to in its filename (for example,
'$packdir/multi-pack-index-<checksum>.bitmap'). As a result, some tests
want a way to learn what '<checksum>' is.

This helper addresses that need by printing the checksum of the
repository's multi-pack-index.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-01 13:56:43 -07:00
aeb4657242 t5310: move some tests to lib-bitmap.sh
We'll soon be adding a test script that will cover many of the same
bitmap concepts as t5310, but for MIDX bitmaps. Let's pull out as many
of the applicable tests as we can so we don't have to rewrite them.

There should be no functional change to t5310; we still run the same
operations in the same order.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-01 13:56:43 -07:00
c528e17966 pack-bitmap: write multi-pack bitmaps
Write multi-pack bitmaps in the format described by
Documentation/technical/bitmap-format.txt, inferring their presence with
the absence of '--bitmap'.

To write a multi-pack bitmap, this patch attempts to reuse as much of
the existing machinery from pack-objects as possible. Specifically, the
MIDX code prepares a packing_data struct that pretends as if a single
packfile has been generated containing all of the objects contained
within the MIDX.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-01 13:56:43 -07:00
0f533c7284 pack-bitmap: read multi-pack bitmaps
This prepares the code in pack-bitmap to interpret the new multi-pack
bitmaps described in Documentation/technical/bitmap-format.txt, which
mostly involves converting bit positions to accommodate looking them up
in a MIDX.

Note that there are currently no writers who write multi-pack bitmaps,
and that this will be implemented in the subsequent commit. Note also
that get_midx_checksum() and get_midx_filename() are made non-static so
they can be called from pack-bitmap.c.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-01 13:56:43 -07:00
a5f9f24aa0 pack-bitmap.c: avoid redundant calls to try_partial_reuse
try_partial_reuse() is used to mark any bits in the beginning of a
bitmap whose objects can be reused verbatim from the pack they came
from.

Currently this function returns void, and signals nothing to the caller
when bits could not be reused. But multi-pack bitmaps would benefit from
having such a signal, because they may try to pass objects which are in
bounds, but from a pack other than the preferred one.

Any extra calls are noops because of a conditional in
reuse_partial_packfile_from_bitmap(), but those loop iterations can be
avoided by letting try_partial_reuse() indicate when it can't accept any
more bits for reuse, and then listening to that signal.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-01 13:56:43 -07:00
711260fd60 pack-bitmap.c: introduce 'bitmap_is_preferred_refname()'
In a recent commit, pack-objects learned support for the
'pack.preferBitmapTips' configuration. This patch prepares the
multi-pack bitmap code to respect this configuration, too.

The yet-to-be implemented code will find that it is more efficient to
check whether each reference contains a prefix found in the configured
set of values rather than doing an additional traversal.

Implement a function 'bitmap_is_preferred_refname()' which will perform
that check. Its caller will be added in a subsequent patch.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-01 13:56:43 -07:00
6b4277e697 pack-bitmap.c: introduce 'nth_bitmap_object_oid()'
A subsequent patch to support reading MIDX bitmaps will be less noisy
after extracting a generic function to fetch the nth OID contained in
the bitmap.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-01 13:56:43 -07:00
ed184620f5 pack-bitmap.c: introduce 'bitmap_num_objects()'
A subsequent patch to support reading MIDX bitmaps will be less noisy
after extracting a generic function to return how many objects are
contained in a bitmap.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-01 13:56:43 -07:00
f57a739691 midx: avoid opening multiple MIDXs when writing
Opening multiple instance of the same MIDX can lead to problems like two
separate packed_git structures which represent the same pack being added
to the repository's object store.

The above scenario can happen because prepare_midx_pack() checks if
`m->packs[pack_int_id]` is NULL in order to determine if a pack has been
opened and installed in the repository before. But a caller can
construct two copies of the same MIDX by calling get_multi_pack_index()
and load_multi_pack_index() since the former manipulates the
object store directly but the latter is a lower-level routine which
allocates a new MIDX for each call.

So if prepare_midx_pack() is called on multiple MIDXs with the same
pack_int_id, then that pack will be installed twice in the object
store's packed_git pointer.

This can lead to problems in, for e.g., the pack-bitmap code, which does
something like the following (in pack-bitmap.c:open_pack_bitmap()):

    struct bitmap_index *bitmap_git = ...;
    for (p = get_all_packs(r); p; p = p->next) {
      if (open_pack_bitmap_1(bitmap_git, p) == 0)
        ret = 0;
    }

which is a problem if two copies of the same pack exist in the
packed_git list because pack-bitmap.c:open_pack_bitmap_1() contains a
conditional like the following:

    if (bitmap_git->pack || bitmap_git->midx) {
      /* ignore extra bitmap file; we can only handle one */
      warning("ignoring extra bitmap file: %s", packfile->pack_name);
      close(fd);
      return -1;
    }

Avoid this scenario by not letting write_midx_internal() open a MIDX
that isn't also pointed at by the object store. So long as this is the
case, other routines should prefer to open MIDXs with
get_multi_pack_index() or reprepare_packed_git() instead of creating
instances on their own. Because get_multi_pack_index() returns
`r->object_store->multi_pack_index` if it is non-NULL, we'll only have
one instance of a MIDX open at one time, avoiding these problems.

To encourage this, drop the `struct multi_pack_index *` parameter from
`write_midx_internal()`, and rely instead on the `object_dir` to find
(or initialize) the correct MIDX instance.

Likewise, replace the call to `close_midx()` with
`close_object_store()`, since we're about to replace the MIDX with a new
one and should invalidate the object store's memory of any MIDX that
might have existed beforehand.

Note that this now forbids passing object directories that don't belong
to alternate repositories over `--object-dir`, since before we would
have happily opened a MIDX in any directory, but now restrict ourselves
to only those reachable by `r->objects->multi_pack_index` (and alternate
MIDXs that we can see by walking the `next` pointer).

As far as I can tell, supporting arbitrary directories with
`--object-dir` was a historical accident, since even the documentation
says `<alt>` when referring to the value passed to this option.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-01 13:56:43 -07:00
caff8b7340 fetch: avoid second connectivity check if we already have all objects
When fetching refs, we are doing two connectivity checks:

    - The first one is done such that we can skip fetching refs in the
      case where we already have all objects referenced by the updated
      set of refs.

    - The second one verifies that we have all objects after we have
      fetched objects.

We always execute both connectivity checks, but this is wasteful in case
the first connectivity check already notices that we have all objects
locally available.

Skip the second connectivity check in case we already had all objects
available. This gives us a nice speedup when doing a mirror-fetch in a
repository with about 2.3M refs where the fetching repo already has all
objects:

    Benchmark #1: HEAD~: git-fetch
      Time (mean ± σ):     30.025 s ±  0.081 s    [User: 27.070 s, System: 4.933 s]
      Range (min … max):   29.900 s … 30.111 s    5 runs

    Benchmark #2: HEAD: git-fetch
      Time (mean ± σ):     25.574 s ±  0.177 s    [User: 22.855 s, System: 4.683 s]
      Range (min … max):   25.399 s … 25.765 s    5 runs

    Summary
      'HEAD: git-fetch' ran
        1.17 ± 0.01 times faster than 'HEAD~: git-fetch'

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-01 12:43:56 -07:00
1c7d1ab6f4 fetch: merge fetching and consuming refs
The functions `fetch_refs()` and `consume_refs()` must always be called
together such that we first obtain all missing objects and then update
our local refs to match the remote refs. In a subsequent patch, we'll
further require that `fetch_refs()` must always be called before
`consume_refs()` such that it can correctly assert that we have all
objects after the fetch given that we're about to move the connectivity
check.

Make this requirement explicit by merging both functions into a single
`fetch_and_consume_refs()` function.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-01 12:43:56 -07:00
284b2ce8fc fetch: refactor fetch refs to be more extendable
Refactor `fetch_refs()` code to make it more extendable by explicitly
handling error cases. The refactored code should behave the same.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-01 12:43:56 -07:00
62b5a35a33 fetch-pack: optimize loading of refs via commit graph
In order to negotiate a packfile, we need to dereference refs to see
which commits we have in common with the remote. To do so, we first look
up the object's type -- if it's a tag, we peel until we hit a non-tag
object. If we hit a commit eventually, then we return that commit.

In case the object ID points to a commit directly, we can avoid the
initial lookup of the object type by opportunistically looking up the
commit via the commit-graph, if available, which gives us a slight speed
bump of about 2% in a huge repository with about 2.3M refs:

    Benchmark #1: HEAD~: git-fetch
      Time (mean ± σ):     31.634 s ±  0.258 s    [User: 28.400 s, System: 5.090 s]
      Range (min … max):   31.280 s … 31.896 s    5 runs

    Benchmark #2: HEAD: git-fetch
      Time (mean ± σ):     31.129 s ±  0.543 s    [User: 27.976 s, System: 5.056 s]
      Range (min … max):   30.172 s … 31.479 s    5 runs

    Summary
      'HEAD: git-fetch' ran
        1.02 ± 0.02 times faster than 'HEAD~: git-fetch'

In case this fails, we fall back to the old code which peels the
objects to a commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-01 12:43:56 -07:00
9fec7b2130 connected: refactor iterator to return next object ID directly
The object ID iterator used by the connectivity checks returns the next
object ID via an out-parameter and then uses a return code to indicate
whether an item was found. This is a bit roundabout: instead of a
separate error code, we can just return the next object ID directly and
use `NULL` pointers as indicator that the iterator got no items left.
Furthermore, this avoids a copy of the object ID.

Refactor the iterator and all its implementations to return object IDs
directly. This brings a tiny performance improvement when doing a mirror-fetch of a repository with about 2.3M refs:

    Benchmark #1: 328dc58b49919c43897240f2eabfa30be2ce32a4~: git-fetch
      Time (mean ± σ):     30.110 s ±  0.148 s    [User: 27.161 s, System: 5.075 s]
      Range (min … max):   29.934 s … 30.406 s    10 runs

    Benchmark #2: 328dc58b49919c43897240f2eabfa30be2ce32a4: git-fetch
      Time (mean ± σ):     29.899 s ±  0.109 s    [User: 26.916 s, System: 5.104 s]
      Range (min … max):   29.696 s … 29.996 s    10 runs

    Summary
      '328dc58b49919c43897240f2eabfa30be2ce32a4: git-fetch' ran
        1.01 ± 0.01 times faster than '328dc58b49919c43897240f2eabfa30be2ce32a4~: git-fetch'

While this 1% speedup could be labelled as statistically insignificant,
the speedup is consistent on my machine. Furthermore, this is an end to
end test, so it is expected that the improvement in the connectivity
check itself is more significant.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-01 12:43:56 -07:00
47c61004c7 fetch: avoid unpacking headers in object existence check
When updating local refs after the fetch has transferred all objects, we
do an object existence test as a safety guard to avoid updating a ref to
an object which we don't have. We do so via `oid_object_info()`: if it
returns an error, then we know the object does not exist.

One side effect of `oid_object_info()` is that it parses the object's
type, and to do so it must unpack the object header. This is completely
pointless: we don't care for the type, but only want to assert that the
object exists.

Refactor the code to use `repo_has_object_file()`, which both makes the
code's intent clearer and is also faster because it does not unpack
object headers. In a real-world repo with 2.3M refs, this results in a
small speedup when doing a mirror-fetch:

    Benchmark #1: HEAD~: git-fetch
      Time (mean ± σ):     33.686 s ±  0.176 s    [User: 30.119 s, System: 5.262 s]
      Range (min … max):   33.512 s … 33.944 s    5 runs

    Benchmark #2: HEAD: git-fetch
      Time (mean ± σ):     31.247 s ±  0.195 s    [User: 28.135 s, System: 5.066 s]
      Range (min … max):   30.948 s … 31.472 s    5 runs

    Summary
      'HEAD: git-fetch' ran
        1.08 ± 0.01 times faster than 'HEAD~: git-fetch'

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-01 12:43:56 -07:00
fe7df03a9a fetch: speed up lookup of want refs via commit-graph
When updating our local refs based on the refs fetched from the remote,
we need to iterate through all requested refs and load their respective
commits such that we can determine whether they need to be appended to
FETCH_HEAD or not. In cases where we're fetching from a remote with
exceedingly many refs, resolving these refs can be quite expensive given
that we repeatedly need to unpack object headers for each of the
referenced objects.

Speed this up by opportunistically trying to resolve object IDs via the
commit graph. We only do so for any refs which are not in "refs/tags":
more likely than not, these are going to be a commit anyway, and this
lets us avoid having to unpack object headers completely in case the
object is a commit that is part of the commit-graph. This significantly
speeds up mirror-fetches in a real-world repository with
2.3M refs:

    Benchmark #1: HEAD~: git-fetch
      Time (mean ± σ):     56.482 s ±  0.384 s    [User: 53.340 s, System: 5.365 s]
      Range (min … max):   56.050 s … 57.045 s    5 runs

    Benchmark #2: HEAD: git-fetch
      Time (mean ± σ):     33.727 s ±  0.170 s    [User: 30.252 s, System: 5.194 s]
      Range (min … max):   33.452 s … 33.871 s    5 runs

    Summary
      'HEAD: git-fetch' ran
        1.67 ± 0.01 times faster than 'HEAD~: git-fetch'

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-01 12:43:56 -07:00
9bb6c2e54f midx: close linked MIDXs, avoid leaking memory
When a repository has at least one alternate, the MIDX belonging to each
alternate is accessed through the `next` pointer on the main object
store's copy of the MIDX. close_midx() didn't bother to close any
of the linked MIDXs. It likewise didn't free the memory pointed to by
`m`, leaving uninitialized bytes with live pointers to them left around
in the heap.

Clean this up by closing linked MIDXs, and freeing up the memory pointed
to by each of them. When callers call close_midx(), then they can
discard the entire linked list of MIDXs and set their pointer to the
head of that list to NULL.

This isn't strictly required for the upcoming patches, but it makes it
much more difficult (though still possible, for e.g., by calling
`close_midx(m->next)` which leaves `m->next` pointing at uninitialized
bytes) to have pointers to uninitialized memory.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-01 10:58:43 -07:00
177c0d6e63 midx: infer preferred pack when not given one
In 9218c6a40c (midx: allow marking a pack as preferred, 2021-03-30), the
multi-pack index code learned how to select a pack which all duplicate
objects are selected from. That is, if an object appears in multiple
packs, select the copy in the preferred pack before breaking ties
according to the other rules like pack mtime and readdir() order.

Not specifying a preferred pack can cause serious problems with
multi-pack reachability bitmaps, because these bitmaps rely on having at
least one pack from which all duplicates are selected. Not having such a
pack causes problems with the code in pack-objects to reuse packs
verbatim (e.g., that code assumes that a delta object in a chunk of pack
sent verbatim will have its base object sent from the same pack).

So why does not marking a pack preferred cause problems here? The reason
is roughly as follows:

  - Ties are broken (when handling duplicate objects) by sorting
    according to midx_oid_compare(), which sorts objects by OID,
    preferred-ness, pack mtime, and finally pack ID (more on that
    later).

  - The psuedo pack-order (described in
    Documentation/technical/pack-format.txt under the section
    "multi-pack-index reverse indexes") is computed by
    midx_pack_order(), and sorts by pack ID and pack offset, with
    preferred packs sorting first.

  - But! Pack IDs come from incrementing the pack count in
    add_pack_to_midx(), which is a callback to
    for_each_file_in_pack_dir(), meaning that pack IDs are assigned in
    readdir() order.

When specifying a preferred pack, all of that works fine, because
duplicate objects are correctly resolved in favor of the copy in the
preferred pack, and the preferred pack sorts first in the object order.

"Sorting first" is critical, because the bitmap code relies on finding
out which pack holds the first object in the MIDX's pseudo pack-order to
determine which pack is preferred.

But if we didn't specify a preferred pack, and the pack which comes
first in readdir() order does not also have the lowest timestamp, then
it's possible that that pack (the one that sorts first in pseudo-pack
order, which the bitmap code will treat as the preferred one) did *not*
have all duplicate objects resolved in its favor, resulting in breakage.

The fix is simple: pick a (semi-arbitrary, non-empty) preferred pack
when none was specified. This forces that pack to have duplicates
resolved in its favor, and (critically) to sort first in pseudo-pack
order.  Unfortunately, testing this behavior portably isn't possible,
since it depends on readdir() order which isn't guaranteed by POSIX.

(Note that multi-pack reachability bitmaps have yet to be implemented;
so in that sense this patch is fixing a bug which does not yet exist.
But by having this patch beforehand, we can prevent the bug from ever
materializing.)

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-01 10:58:43 -07:00
5d3cd09a80 midx: reject empty --preferred-pack's
The soon-to-be-implemented multi-pack bitmap treats object in the first
bit position specially by assuming that all objects in the pack it was
selected from are also represented from that pack in the MIDX. In other
words, the pack from which the first object was selected must also have
all of its other objects selected from that same pack in the MIDX in
case of any duplicates.

But this assumption relies on the fact that there is at least one object
in that pack to begin with; otherwise the object in the first bit
position isn't from a preferred pack, in which case we can no longer
assume that all objects in that pack were also selected from the same
pack.

Guard this assumption by checking the number of objects in the given
preferred pack, and failing if the given pack is empty.

To make sure we can safely perform this check, open any packs which are
contained in an existing MIDX via prepare_midx_pack(). The same is done
for new packs via the add_pack_to_midx() callback, but packs picked up
from a previous MIDX will not yet have these opened.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-01 10:58:43 -07:00
f5909d34ca midx: clear auxiliary .rev after replacing the MIDX
When writing a new multi-pack index, write_midx_internal() attempts to
clean up any auxiliary files (currently just the MIDX's `.rev` file, but
soon to include a `.bitmap`, too) corresponding to the MIDX it's
replacing.

This step should happen after the new MIDX is written into place, since
doing so beforehand means that the old MIDX could be read without its
corresponding .rev file.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-01 10:58:43 -07:00
426c00e454 midx: fix *.rev cleanups with --object-dir
If using --object-dir to point into an object directory which belongs to
a different repository than the one in the current working directory,
such as:

  git init repo
  git -C repo ... # add some objects
  cd alternate
  git multi-pack-index --object-dir ../repo/.git/objects write

the binary will segfault trying to access the object-dir via the repo it
found, but that's not fully initialized. Worse, if we later call
clear_midx_files_ext(), we will use `the_repository` and remove files
out of the wrong object directory.

Fix this by using the given object_dir (or the object directory of
`the_repository` if `--object-dir` wasn't given) to properly to clean up
the *.rev files, avoiding the crash.

Original-patch-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-01 10:58:43 -07:00
73ff4ad086 midx: disallow running outside of a repository
The multi-pack-index command supports working with arbitrary object
directories via the `--object-dir` flag. Though this has historically
worked in arbitrary repositories (including when the command itself was
run outside of a Git repository), this has been somewhat of an accident.

For example, running:

    git multi-pack-index write --object-dir=/path/to/repo/objects

outside of a Git repository causes a BUG(). This is because the
top-level `cmd_multi_pack_index()` function stops parsing when it sees
"write", and then fills in the default object directory (the result of
calling `get_object_directory()`) before handing off to
`cmd_multi_pack_index_write()`. But there is no repository to
initialize, and so calling `get_object_directory()` results in a BUG()
(indicating that the current repository is not initialized).

Another case where this doesn't quite work as expected is when operating
in a SHA-256 repository. To see the failure, try this in your shell:

    git init --object-format=sha256 repo
    git -C repo commit --allow-empty base
    git -C repo repack -d

    git multi-pack-index --object-dir=$(pwd)/repo/.git/objects write

and observe that we cannot open the `.idx` file in "repo", because the
outermost process assumes that any repository that it works in also uses
the default value of `the_hash_algo` (at the time of writing, SHA-1).

There may be compelling reasons for trying to work around these bugs,
but working in arbitrary `--object-dir`'s is non-standard enough (and
likewise, these bugs prevalent enough) that I don't think any workflows
would be broken by abandoning this behavior.

Accordingly, restrict the `multi-pack-index` builtin to only work when
inside of a Git repository (i.e., its main utility becomes selecting
which alternate to operate in), which avoids both of the bugs above.

(Note that you can still trigger a bug when writing a MIDX in an
alternate which does not use the same object format as the repository
which it is an alternate of, but that is an unrelated bug to this one).

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-01 10:58:43 -07:00
70afef5cdf upload-pack: use stdio in send_ref callbacks
In both protocol v0 and v2, upload-pack writes one pktline packet per
advertised ref to stdout. That means one or two write(2) syscalls per
ref. This is problematic if these writes become network sends with
high overhead.

This commit changes both send_ref callbacks to use buffered IO using
stdio.

To give an example of the impact: I set up a single-threaded loop that
calls ls-remote (with HTTP and protocol v2) on a local GitLab
instance, on a repository with 11K refs. When I switch from Git
v2.32.0 to this patch, I see a 40% reduction in CPU time for Git, and
65% for Gitaly (GitLab's Git RPC service).

So using buffered IO not only saves syscalls in upload-pack, it also
saves time in things that consume upload-pack's output.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Jacob Vosmaer <jacob@gitlab.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-01 10:20:39 -07:00
96328398b3 pkt-line: add stdio packet write functions
This adds three new functions to pkt-line.c: packet_fwrite,
packet_fwrite_fmt and packet_fflush. Besides writing a pktline flush
packet, packet_fflush also flushes the stdio buffer of the stream.

Helped-by: Patrick Steinhardt <ps@pks.im>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Jacob Vosmaer <jacob@gitlab.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-01 10:20:39 -07:00
53a66ec37c docs: clarify the interaction of transfer.hideRefs and namespaces
Expand the section about namespaces in the documentation of
`transfer.hideRefs` to point out the subtle differences between
`upload-pack` and `receive-pack`.

ffcfb68176 (upload-pack.c: treat want-ref relative to namespace,
2021-07-30) taught `upload-pack` to reject `want-ref`s for hidden refs,
which is now mentioned. It is clarified that at no point the name of a
hidden ref is revealed, but the object id it points to may.

Signed-off-by: Kim Altintop <kim@eagain.st>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-01 07:54:30 -07:00
3955140653 upload-pack.c: treat want-ref relative to namespace
When 'upload-pack' runs within the context of a git namespace, treat any
'want-ref' lines the client sends as relative to that namespace.

Also check if the wanted ref is hidden via 'hideRefs'. If it is hidden,
respond with an error as if the ref didn't exist.

Helped-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Kim Altintop <kim@eagain.st>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-01 07:54:18 -07:00
bac01c6469 t5730: introduce fetch command helper
Assembling a "raw" fetch command to be fed directly to "test-tool serve-v2"
is extracted into a test helper.

Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Kim Altintop <kim@eagain.st>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-01 07:54:00 -07:00
ccdd5d1eb1 mailmap.c: fix a memory leak in free_mailap_{info,entry}()
In the free_mailmap_entry() code added in 0925ce4d49 (Add map_user()
and clear_mailmap() to mailmap, 2009-02-08) the intent was clearly to
clear the "me" structure, but while we freed parts of the
mailmap_entry structure, we didn't free the structure itself. The same
goes for the "mailmap_info" structure.

This brings the number of SANITIZE=leak failures in t4203-mailmap.sh
down from 50 to 49. Not really progress as far as the number of
failures is concerned, but as far as I can tell this fixes all leaks
in mailmap.c itself. There's still users of it such as builtin/log.c
that call read_mailmap() without a clear_mailmap(), but that's on
them.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-31 12:38:09 -07:00
2c7f3aacd3 userdiff: support enum keyword in PHP hunk header
"enum" keyword will be introduced in PHP 8.1.
https://wiki.php.net/rfc/enumerations

Signed-off-by: USAMI Kenta <tadsan@zonu.me>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-31 12:13:36 -07:00
2f040a9671 fast-export: fix anonymized tag using original length
Commit 7f40759496 (fast-export: tighten anonymize_mem() interface to
handle only strings, 2020-06-23) changed the interface used in anonymizing
strings, but failed to update the size of annotated tag messages to match
the new anonymized string.

As a result, exporting tags having messages longer than 13 characters
would create output that couldn't be parsed by fast-import,
as the data length indicated was larger than the data output.

Reset the message size when anonymizing, and add a tag with a "long"
message to the test.

Signed-off-by: Tal Kelrich <hasturkun@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-31 12:11:57 -07:00
88682b016d protocol-caps.c: fix memory leak in send_info()
Fix a memory leak in a2ba162cda (object-info: support for retrieving
object info, 2021-04-20) which appears to have been based on a
misunderstanding of how the pkt-line.c API works. There is no need to
strdup() input to packet_writer_write(), it's just a printf()-like
format function.

This fixes a potentially large memory leak, since the number of OID
lines the "object-info" call can be arbitrarily large (or a small one
if the request is small).

This makes t5701-git-serve.sh pass again under SANITIZE=leak, as it
did before a2ba162cda.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Bruno Albuquerque <bga@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-31 11:15:16 -07:00
67f61efbb9 diff --submodule=diff: don't print failure message twice
When we fail to start a diff command inside a submodule, immediately
exit the routine rather than trying to finish the command and printing
a second message.

Signed-off-by: David Turner <dturner@twosigma.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-31 10:12:13 -07:00
f1c0368da4 diff --submodule=diff: do not fail on ever-initialied deleted submodules
If you have ever initialized a submodule, open_submodule will open it.
If you then delete the submodule's worktree directory (but don't
remove it from .gitmodules), git diff --submodule=diff would error out
as it attempted to chdir into the now-deleted working tree directory.

This only matters if the submodules git dir is absorbed.  If not, then
we no longer have anywhere to run the diff.  But that case does not
trigger this error, because in that case, open_submodule fails, so we
don't resolve a left commit, so we exit early, which is the only thing
we could do.

If absorbed, then we can run the diff from the submodule's absorbed
git dir (.git/modules/sm2).  In practice, that's a bit more
complicated, because `git diff` expects to be run from inside a
working directory, not a git dir.  So it looks in the config for
core.worktree, and does chdir("../../../sm2"), which is the very dir
that we're trying to avoid visiting because it's been deleted.  We
work around this by setting GIT_WORK_TREE (and GIT_DIR) to ".".  It is
little weird to set GIT_WORK_TREE to something that is not a working
tree just to avoid an unnecessary chdir, but it works.

Signed-off-by: David Turner <dturner@twosigma.com
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-31 10:11:48 -07:00
367c5f36a6 commit-graph: show "unexpected subcommand" error
Bring the "commit-graph" command in line with the error output and
general pattern in cmd_multi_pack_index().

Let's test for that output, and also cover the same potential bug as
was fixed in the multi-pack-index command in
88617d11f9 (multi-pack-index: fix potential segfault without
sub-command, 2021-07-19).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-30 17:06:18 -07:00
6d209a01f8 commit-graph: show usage on "commit-graph [write|verify] garbage"
Change the parse_options() invocation in the commit-graph code to
error on unknown leftover argv elements, in addition to the existing
and implicit erroring via parse_options() on unknown options.

We'd already error in cmd_commit_graph() on e.g.:

    git commit-graph unknown verify
    git commit-graph --unknown verify

But here we're calling parse_options() twice more for the "write" and
"verify" subcommands. We did not do the same checking for leftover
argv elements there. As a result we'd silently accept garbage in these
subcommands, let's not do that.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-30 17:06:18 -07:00
070e7c5619 commit-graph: early exit to "usage" on !argc
Rather than guarding all of the !argc with an additional "if" arm
let's do an early goto to "usage". This also makes it clear that
"save_commit_buffer" is not needed in this case.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-30 17:06:18 -07:00
92f480909f multi-pack-index: refactor "goto usage" pattern
Refactor the "goto usage" pattern added in
cd57bc41bb (builtin/multi-pack-index.c: display usage on unrecognized
command, 2021-03-30) and 88617d11f9 (multi-pack-index: fix potential
segfault without sub-command, 2021-07-19) to maintain the same
brevity, but in a form that doesn't run afoul of the recommendation in
CodingGuidelines about braces:

    When there are multiple arms to a conditional and some of them
    require braces, enclose even a single line block in braces for
    consistency[...]

Let's also change "argv == 0" to juts "!argv", per:

    Do not explicitly compare an integral value with constant 0 or
    '\0', or a pointer value with constant NULL[...]

I'm changing this because in a subsequent commit I'll make
builtin/commit-graph.c use the same pattern, having the two similarly
structured commands match aids readability.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-30 17:06:18 -07:00
84e4484f12 commit-graph: use parse_options_concat()
Make use of the parse_options_concat() so we don't need to copy/paste
common options like --object-dir.

This is inspired by a similar change to "checkout" in 2087182272
(checkout: split options[] array in three pieces, 2019-03-29), and the
same pattern in the multi-pack-index command, see
60ca94769c (builtin/multi-pack-index.c: split sub-commands,
2021-03-30).

A minor behavior change here is that now we're going to list both
--object-dir and --progress first, before we'd list --progress along
with other options.

Co-authored-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-30 17:06:18 -07:00
8722f9fb6b commit-graph: remove redundant handling of -h
If we don't handle the -h option here like most parse_options() users
we'll fall through and it'll do the right thing for us.

I think this code added in 4ce58ee38d (commit-graph: create
git-commit-graph builtin, 2018-04-02) was always redundant,
parse_options() did this at the time, and the commit-graph code never
used PARSE_OPT_NO_INTERNAL_HELP.

We don't need a test for this, it's tested by the t0012-help.sh test
added in d691551192 (t0012: test "-h" with builtins, 2017-05-30).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-30 17:06:18 -07:00
8757b35d44 commit-graph: define common usage with a macro
Share the usage message between these three variables by using a
macro. Before this new options needed to copy/paste the usage
information, see e.g. 809e0327f5 (builtin/commit-graph.c: introduce
'--max-new-filters=<n>', 2020-09-18).

See b25b727494 (builtin/multi-pack-index.c: define common usage with
a macro, 2021-03-30) for another use of this pattern (but on-list this
one came first).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-30 17:06:18 -07:00
767a4ca648 sequencer: advise if skipping cherry-picked commit
Silently skipping commits when rebasing with --no-reapply-cherry-picks
(currently the default behavior) can cause user confusion. Issue
warnings when this happens, as well as advice on how to preserve the
skipped commits.

These warnings and advice are displayed only when using the (default)
"merge" rebase backend.

Update the git-rebase docs to mention the warnings and advice.

Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-30 16:35:36 -07:00
6c40894d24 The second batch
The most significant of this batch is of course "merge -sort".
Thanks, Elijah and everybody who helped the topic.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-30 16:06:22 -07:00
85e73cc8ac Merge branch 'cb/ci-freebsd-update'
Update FreeBSD CI job

* cb/ci-freebsd-update:
  ci: update freebsd 12 cirrus job
2021-08-30 16:06:06 -07:00
0d4f46b768 Merge branch 'tl/traverse-non-commits-rename'
Meh.

* tl/traverse-non-commits-rename:
  list-objects.c: rename "traverse_trees_and_blobs" to "traverse_non_commits"
2021-08-30 16:06:06 -07:00
bfd515ac56 Merge branch 'bc/t5607-avoid-broken-test-fail-prereqs'
The current implementation of GIT_TEST_FAIL_PREREQS is broken in
that checking for the lack of a prerequisite would not work.  Avoid
the use of "if ! test_have_prereq X" in a test script.

* bc/t5607-avoid-broken-test-fail-prereqs:
  t5607: avoid using prerequisites to select algorithm
2021-08-30 16:06:05 -07:00
a896086851 Merge branch 'th/userdiff-more-java'
The userdiff pattern for "java" language has been updated.

* th/userdiff-more-java:
  userdiff: improve java hunk header regex
2021-08-30 16:06:05 -07:00
fb0b14df65 Merge branch 'jk/range-diff-fixes'
"git range-diff" code clean-up.

* jk/range-diff-fixes:
  range-diff: use ssize_t for parsed "len" in read_patches()
  range-diff: handle unterminated lines in read_patches()
  range-diff: drop useless "offset" variable from read_patches()
2021-08-30 16:06:05 -07:00
7e3b9d1534 Merge branch 'jk/apply-binary-hunk-parsing-fix'
"git apply" miscounted the bytes and failed to read to the end of
binary hunks.

* jk/apply-binary-hunk-parsing-fix:
  apply: keep buffer/size pair in sync when parsing binary hunks
2021-08-30 16:06:04 -07:00
e1eb133476 Merge branch 'jc/userdiff-pattern-hint'
Remind developers that the userdiff patterns should be kept simple
and permissive, assuming that the contents they apply are always
syntactically correct.

* jc/userdiff-pattern-hint:
  userdiff: comment on the builtin patterns
2021-08-30 16:06:03 -07:00
669277c551 Merge branch 'cb/builtin-merge-format-string-fix'
Code clean-up.

* cb/builtin-merge-format-string-fix:
  builtin/merge: avoid -Wformat-extra-args from ancient Xcode
2021-08-30 16:06:03 -07:00
b81a85ecd8 Merge branch 'js/log-protocol-version'
Debugging aid.

* js/log-protocol-version:
  connect, protocol: log negotiated protocol version
2021-08-30 16:06:02 -07:00
8778fa8b4f Merge branch 'en/ort-becomes-the-default'
Use `ort` instead of `recursive` as the default merge strategy.

* en/ort-becomes-the-default:
  Update docs for change of default merge backend
  Change default merge backend from recursive to ort
2021-08-30 16:06:01 -07:00
aca13c2355 Merge branch 'en/merge-strategy-docs'
Documentation updates.

* en/merge-strategy-docs:
  Update error message and code comment
  merge-strategies.txt: add coverage of the `ort` merge strategy
  git-rebase.txt: correct out-of-date and misleading text about renames
  merge-strategies.txt: fix simple capitalization error
  merge-strategies.txt: avoid giving special preference to patience algorithm
  merge-strategies.txt: do not imply using copy detection is desired
  merge-strategies.txt: update wording for the resolve strategy
  Documentation: edit awkward references to `git merge-recursive`
  directory-rename-detection.txt: small updates due to merge-ort optimizations
  git-rebase.txt: correct antiquated claims about --rebase-merges
2021-08-30 16:06:01 -07:00
7d0daf3f12 Merge branch 'en/pull-conflicting-options'
"git pull" had various corner cases that were not well thought out
around its --rebase backend, e.g. "git pull --ff-only" did not stop
but went ahead and rebased when the history on other side is not a
descendant of our history.  The series tries to fix them up.

* en/pull-conflicting-options:
  pull: fix handling of multiple heads
  pull: update docs & code for option compatibility with rebasing
  pull: abort by default when fast-forwarding is not possible
  pull: make --rebase and --no-rebase override pull.ff=only
  pull: since --ff-only overrides, handle it first
  pull: abort if --ff-only is given and fast-forwarding is impossible
  t7601: add tests of interactions with multiple merge heads and config
  t7601: test interaction of merge/rebase/fast-forward flags and options
2021-08-30 16:06:01 -07:00
48072e3d68 clone: set submodule.recurse=true if submodule.stickyRecursiveClone enabled
Based on current experience, when running git clone --recurse-submodules,
developers do not expect other commands such as pull or checkout to run
recursively into active submodules. However, setting submodule.recurse=true
at this step could make for a simpler workflow by eliminating the need for
the --recurse-submodules option in subsequent commands. To collect more
data on developers' preference in regards to making submodule.recurse=true
a default config value in the future, deploy this feature under the opt in
submodule.stickyRecursiveClone flag.

Signed-off-by: Mahi Kolla <mkolla2@illinois.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-30 14:23:17 -07:00
e082113484 send-email: avoid incorrect header propagation
If multiple independent patches are sent with send-email, even if the
"In-Reply-To" and "References" headers are not managed by --thread or
--in-reply-to, their values may be propagated from prior patches to
subsequent patches with no such headers defined.

To mitigate this and potential future issues, make sure all global
patch-specific variables are always either handled by
command-specific code (e.g. threading), or are reset to their default
values for every iteration.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Marvin Häuser <mhaeuser@posteo.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-30 13:25:28 -07:00
f6bb64df82 fetch: skip formatting updated refs with --quiet
When fetching, Git will by default print a list of all updated refs in a
nicely formatted table. In order to come up with this table, Git needs
to iterate refs twice: first to determine the maximum column width, and
a second time to actually format these changed refs.

While this table will not be printed in case the user passes `--quiet`,
we still go out of our way and do all these steps. In fact, we even do
more work compared to not passing `--quiet`: without the flag, we will
skip all references in the column width computation which have not been
updated, but if it is set we will now compute widths for all refs.

Fix this issue by completely skipping both preparation of the format and
formatting data for display in case the user passes `--quiet`, improving
performance especially with many refs. The following benchmark shows a
nice speedup for a quiet mirror-fetch in a repository with 2.3M refs:

    Benchmark #1: HEAD~: git-fetch
      Time (mean ± σ):     26.929 s ±  0.145 s    [User: 24.194 s, System: 4.656 s]
      Range (min … max):   26.692 s … 27.068 s    5 runs

    Benchmark #2: HEAD: git-fetch
      Time (mean ± σ):     25.189 s ±  0.094 s    [User: 22.556 s, System: 4.606 s]
      Range (min … max):   25.070 s … 25.314 s    5 runs

    Summary
      'HEAD: git-fetch' ran
        1.07 ± 0.01 times faster than 'HEAD~: git-fetch'

While at it, this patch also fixes `adjust_refcol_width()` such that it
skips unchanged refs in case the user passed `--quiet`, where verbosity
will be negative. While this function won't be called anymore if so,
this brings the comment in line with actual code. Furthermore, needless
`verbosity >= 0` checks are now removed in `store_updated_refs()`: we
never print to the `note` buffer anymore in case `verbosity < 0`, so we
won't end up in that code block anyway.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-30 10:13:55 -07:00
2dee7e6105 merge-recursive: use fspathcmp() in path_hashmap_cmp()
Call fspathcmp() instead of open-coding it.  This shortens the code and
makes it less repetitive.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-30 09:44:12 -07:00
614c3d8f2e test-lib: set GIT_CEILING_DIRECTORIES to protect the surrounding repository
Every once in a while a test somehow manages to escape from its trash
directory and modifies the surrounding repository, whether because of
a bug in git itself, a bug in a test [1], or e.g. when trying to run
tests with a shell that is, in general, unable to run our tests [2].

Set GIT_CEILING_DIRECTORIES="$TRASH_DIRECTORY/.." as an additional
safety measure to protect the surrounding repository at least from
modifications by git commands executed in the tests (assuming that
handling of ceiling directories during repository discovery is not
broken, and, of course, it won't save us from regular shell commands,
e.g. 'cd .. && rm -f ...').

[1] e.g. https://public-inbox.org/git/20210423051255.GD2947267@szeder.dev
[2] $ git symbolic-ref HEAD
    refs/heads/master
    $ ksh ./t2011-checkout-invalid-head.sh
    [... a lot of "not ok" ...]
    $ git symbolic-ref HEAD
    refs/heads/other

    (In short: 'ksh' doesn't support the 'local' builtin command,
    which is used by 'test_oid', causing it to return with error
    whenever it's called, leaving ZERO_OID set to empty, so when the
    test 'checkout main from invalid HEAD' runs 'echo $ZERO_OID
    >.git/HEAD' it writes a corrupt (not invalid) HEAD, and subsequent
    git commands don't recognize the repository in the trash directory
    anymore, but operate on the surrounding repo.)

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-30 09:42:49 -07:00
469888e6a5 doc: fix syntax error and the format of printf
Fix syntax and correct the format of printf in MyFirstObjectWalk.txt

Signed-off-by: Zoker <kaixuanguiqu@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-30 09:30:32 -07:00
d9e9b44d7a sparse-index: copy dir_hash in ensure_full_index()
Copy the 'index_state->dir_hash' back to the real istate after expanding
a sparse index.

A crash was observed in 'git status' during some hashmap lookups with
corrupted hashmap entries.  During an index expansion, new cache-entries
are added to the 'index_state->name_hash' and the 'dir_hash' in a
temporary 'index_state' variable 'full'.  However, only the 'name_hash'
hashmap from this temp variable was copied back into the real 'istate'
variable.  The original copy of the 'dir_hash' was incorrectly
preserved.  If the table in the 'full->dir_hash' hashmap were realloced,
the stale version (in 'istate') would be corrupted.

The test suite does not operate on index sizes sufficiently large to
trigger this reallocation, so they do not cover this behavior.
Increasing the test suite to cover such scale is fragile and likely
wasteful.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-30 09:24:12 -07:00
b0173340c6 builtin/pack-objects.c: remove duplicate hash lookup
In the original code from 08cdfb1337 (pack-objects --keep-unreachable,
2007-09-16), we add each object to the packing list with type
`obj->type`, where `obj` comes from `lookup_unknown_object()`. Unless we
had already looked up and parsed the object, this will be `OBJ_NONE`.
That's fine, since oe_set_type() sets the type_valid bit to '0', and we
determine the real type later on.

So the only thing we need from the object lookup is access to the
`flags` field so that we can mark that we've added the object with
`OBJECT_ADDED` to avoid adding it again (we can just pass `OBJ_NONE`
directly instead of grabbing it from the object).

But add_object_entry() already rejects duplicates! This has been the
behavior since 7a979d99ba (Thin pack - create packfile with missing
delta base., 2006-02-19), but 08cdfb1337 didn't take advantage of it.
Moreover, to do the OBJECT_ADDED check, we have to do a hash lookup in
`obj_hash`.

So we can drop the lookup_unknown_object() call completely, *and* the
OBJECT_ADDED flag, too, since the spot we're touching here is the only
location that checks it.

In the end, we perform the same number of hash lookups, but with the
added bonus that we don't waste memory allocating an OBJ_NONE object (if
we were traversing, we'd need it eventually, but the whole point of this
code path is not to traverse).

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-29 23:25:43 -07:00
a9fd2f207d builtin/pack-objects.c: simplify add_objects_in_unpacked_packs()
This function is used to implement `pack-objects`'s `--keep-unreachable`
option, but can be simplified in a couple of ways:

  - add_objects_in_unpacked_packs() iterates over all packs (and then
    all packed objects) itself, but could use for_each_packed_object()
    instead since the missing flags necessary were added in the previous
    commit

  - objects are added to an in_pack array which store (off_t, object)
    tuples, and then sorted in offset order when we could iterate
    objects in offset order.

    There is a slight behavior change here: before we would have added
    objects in sorted offset order among _all_ packs. Handing objects to
    create_object_entry() in pack order for each pack (instead of
    feeding objects from all packs simultaneously their offset relative
    to different packs) is much more reasonable, if different than how
    the code currently works.

  - objects in a single pack are iterated in index order and searched
    for in order to discover their offsets, which is much less efficient
    than using the on-disk reverse index

Simplify the function by addressing each of the above and moving the
core of the loop into a callback function that we then pass to
for_each_packed_object() instead of open-coding the latter function
ourselves.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-29 23:25:20 -07:00
a241878ac7 object-store.h: teach for_each_packed_object to ignore kept packs
The next patch will reimplement a function that wants to iterate over
packed objects while ignoring packs which are marked as kept (either
in-core or on-disk).

Teach for_each_packed_object() to ignore all objects from those packs by
adding a new flag for each of the "kept" states that a pack can be in.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-29 23:23:39 -07:00
0834257379 bundle API: start writing API documentation
There are no other API docs in bundle.h, but this is at least a
start. We'll add a parameter to this function in a subsequent commit,
but let's start by documenting it.

The "/**" comment (as opposed to "/*") signifies the start of API
documentation. See [1] and bdfdaa4978 (strbuf.h: integrate
api-strbuf.txt documentation, 2015-01-16) and 6afbbdda33 (strbuf.h:
unify documentation comments beginnings, 2015-01-16) for a discussion
of that convention.

1. https://lore.kernel.org/git/874kbeecfu.fsf@evledraar.gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-27 18:07:27 -07:00
597a977489 branch: allow deleting dangling branches with --force
git branch only allows deleting branches that point to valid commits.
Skip that check if --force is given, as the caller is indicating with
it that they know what they are doing and accept the consequences.
This allows deleting dangling branches, which previously had to be
reset to a valid start-point using --force first.

Reported-by: Ulrich Windl <Ulrich.Windl@rz.uni-regensburg.de>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-27 15:11:18 -07:00
e124ecf7f7 archive: convert queue_directory to struct object_id
Pass the struct object_id on instead of just its hash member.
This is simpler and avoids the need to guess the algorithm.

Signed-off-by: René Scharfe <l.s.r@web.de>
Acked-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-27 14:19:00 -07:00
e4f8d27585 show-branch: simplify rev_is_head()
Only one of the callers of rev_is_head() provides two hashes to compare.
Move that check there and convert it to struct object_id.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-27 14:12:15 -07:00
1e93770888 docs: use "character encoding" to refer to commit-object encoding
The word "encoding" can mean a lot of things (e.g., base64 or
quoted-printable encoding in emails, HTML entities, URL encoding, and so
on). The documentation for i18n.commitEncoding and i18n.logOutputEncoding
uses the phrase "character encoding" to make this more clear.

Let's use that phrase in other places to make it clear what kind of
encoding we are talking about. This patch covers the gui.encoding
option, as well as the --encoding option for git-log, etc (in this
latter case, I word-smithed the sentence a little at the same time).
That, coupled with the mention of iconv in the --encoding description,
should make this more clear.

The other spot I looked at is the working-tree-encoding section of
gitattributes(5). But it gives specific examples of encodings that I
think make the meaning pretty clear already.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-27 12:45:45 -07:00
fd680bc558 logmsg_reencode(): warn when iconv() fails
If the user asks for a pretty-printed commit to be converted (either
explicitly with --encoding=foo, or implicitly because the commit is
non-utf8 and we want to convert it), we pass it through iconv(). If that
fails, we fall back to showing the input verbatim, but don't tell the
user that the output may be bogus.

Let's add a warning to do so, along with a mention in the documentation
for --encoding. Two things to note about the implementation:

  - we could produce the warning closer to the call to iconv() in
    reencode_string_len(), which would let us relay the value of errno.
    But this is not actually very helpful. reencode_string_len() does
    not know we are operating on a commit, and indeed does not know that
    the caller won't produce an error of its own. And the errno values
    from iconv() are seldom helpful (iconv_open() only ever produces
    EINVAL; perhaps EILSEQ from iconv() might be illuminating, but it
    can also return EINVAL for incomplete sequences).

  - if the reason for the failure is that the output charset is not
    supported, then the user will see this warning for every commit we
    try to display. That might be ugly and overwhelming, but on the
    other hand it is making it clear that every one of them has not been
    converted (and the likely outcome anyway is to re-try the command
    with a supported output encoding).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-27 12:43:22 -07:00
7a132c628e checkout: make delayed checkout respect --quiet and --no-progress
The 'Filtering contents...' progress report from delayed checkout is
displayed even when checkout and clone are invoked with --quiet or
--no-progress. Furthermore, it is displayed unconditionally, without
first checking whether stdout is a tty. Let's fix these issues and also
add some regression tests for the two code paths that currently use
delayed checkout: unpack_trees.c:check_updates() and
builtin/checkout.c:checkout_worktree().

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-26 23:15:33 -07:00
c93ca46cf5 column: fix parsing of the '--nl' option
'git column's '--nl' option can be used to specify a "string to be
printed at the end of each line" (quoting the man page), but this
option and its mandatory argument has been parsed as OPT_INTEGER since
the introduction of the command in 7e29b8254f (Add column layout
skeleton and git-column, 2012-04-21).  Consequently, any non-number
argument is rejected by parse-options, and any number other than 0
leads to segfault:

  $ printf "%s\n" one two |git column --mode=plain --nl=foo
  error: option `nl' expects a numerical value
  $ printf "%s\n" one two |git column --mode=plain --nl=42
  Segmentation fault (core dumped)
  $ printf "%s\n" one two |git column --mode=plain --nl=0
  one
  two

Parse this option as OPT_STRING.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-26 14:36:27 -07:00
f54b9f21ca ls-refs: reuse buffer when sending refs
In the initial reference advertisement, the Git server will first
announce all of its references to the client. The logic is handled in
`send_ref()`, which will allocate a new buffer for each refline it is
about to send. This is quite wasteful: instead of allocating a new
buffer each time, we can just reuse a buffer.

Improve this by passing in a buffer via the `ls_refs_data` struct which
is then reused on each reference.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-25 15:55:29 -07:00
66e905b7dd use xopen() to handle fatal open(2) failures
Add and apply a semantic patch for using xopen() instead of calling
open(2) and die() or die_errno() explicitly.  This makes the error
messages more consistent and shortens the code.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-25 14:39:08 -07:00
a7439d0f9d xopen: explicitly report creation failures
If the flags O_CREAT and O_EXCL are both given then open(2) is supposed
to create the file and error out if it already exists.  The error
message in that case looks like this:

	fatal: could not open 'foo' for writing: File exists

Without further context this is confusing: Why should the existence of
the file pose a problem?  Isn't that a requirement for writing to it?

Add a more specific error message for that case to tell the user that we
actually don't expect the file to preexist, so the example becomes:

	fatal: unable to create 'foo': File exists

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-25 14:39:06 -07:00
5b12e16bb1 refs: make errno output explicit for read_raw_ref_fn
This makes it explicit how alternative ref backends should report errors in
read_raw_ref_fn.

read_raw_ref_fn needs to supply a credible errno for a number of cases. These
are primarily:

1) The files backend calls read_raw_ref from lock_raw_ref, and uses the
resulting error codes to create/remove directories as needed.

2) ENOENT should be translated in a zero OID, optionally with REF_ISBROKEN set,
returning the last successfully resolved symref. This is necessary so
read_raw_ref("HEAD") on an empty repo returns refs/heads/main (or the default branch
du-jour), and we know on which branch to create the first commit.

Make this information flow explicit by adding a failure_errno to the signature
of read_raw_ref. All errnos from the files backend are still propagated
unchanged, even though inspection suggests only ENOTDIR, EISDIR and ENOENT are
relevant.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-25 13:30:26 -07:00
1ae6ed230a refs/files-backend: stop setting errno from lock_ref_oid_basic
refs/files-backend.c::lock_ref_oid_basic() tries to signal how it failed
to its callers using errno.

It is safe to stop setting errno here, because the callers of this
file-scope static function are

* files_copy_or_rename_ref()
* files_create_symref()
* files_reflog_expire()

None of them looks at errno after seeing a negative return from
lock_ref_oid_basic() to make any decision, and no caller of these three
functions looks at errno after they signal a failure by returning a
negative value. In particular,

* files_copy_or_rename_ref() - here, calls are followed by error()
(which performs I/O) or write_ref_to_lockfile() (which calls
parse_object() which may perform I/O)

* files_create_symref() - here, calls are followed by error() or
create_symref_locked() (which performs I/O and does not inspect
errno)

* files_reflog_expire() - here, calls are followed by error() or
refs_reflog_exists() (which calls a function in a vtable that is not
documented to use and/or preserve errno)

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-25 13:30:26 -07:00
20d422cfd7 refs: remove EINVAL errno output from specification of read_raw_ref_fn
This commit does not change code; it documents the fact that an alternate ref
backend does not need to return EINVAL from read_raw_ref_fn to function
properly.

This is correct, because refs_read_raw_ref is only called from;

* resolve_ref_unsafe(), which does not care for the EINVAL errno result.

* refs_verify_refname_available(), which does not inspect errno.

* files-backend.c, where errno is overwritten on failure.

* packed-backend.c (is_packed_transaction_needed), which calls it for the
  packed ref backend, which never emits EINVAL.

A grep for EINVAL */*c reveals that no code checks errno against EINVAL after
reading references. In addition, the refs.h file does not mention errno at all.

A grep over resolve_ref_unsafe() turned up the following callers that inspect
errno:

* sequencer.c::print_commit_summary, which uses it for die_errno

* lock_ref_oid_basic(), which only treats EISDIR and ENOTDIR specially.

The files ref backend does use EINVAL. The files backend does not call into
the generic API (refs_read_raw), but into the files-specific function
(files_read_raw_ref), which we are not changing in this commit.

As the errno sideband is unintuitive and error-prone, remove EINVAL
value, as a step towards getting rid of the errno sideband altogether.

Spotted by Ævar Arnfjörð Bjarmason <avarab@gmail.com>.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-25 13:30:26 -07:00
3fa2e91d17 refs file backend: move raceproof_create_file() here
Move the raceproof_create_file() API added to cache.h and
object-file.c in 177978f56a (raceproof_create_file(): new function,
2017-01-06) to its only user, refs/files-backend.c.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-25 13:30:26 -07:00
48cdcd9ca0 refs/files: remove unused "errno != ENOTDIR" condition
As a follow-up to the preceding commit where we removed the adjacent
"errno == EISDIR" condition in the same function, remove the
"last_errno != ENOTDIR" condition here.

It's not possible for us to hit this condition added in
5b2d8d6f21 (lock_ref_sha1_basic(): improve diagnostics for ref D/F
conflicts, 2015-05-11). Since a1c1d8170d (refs_resolve_ref_unsafe:
handle d/f conflicts for writes, 2017-10-06) we've explicitly caught
these in refs_resolve_ref_unsafe() before returning NULL:

	if (errno != ENOENT &&
	    errno != EISDIR &&
	    errno != ENOTDIR)
		return NULL;

We'd then always return the refname from refs_resolve_ref_unsafe()
even if we were in a broken state as explained in the preceding
commit. The elided context here is a call to refs_resolve_ref_unsafe().

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-25 13:27:37 -07:00
245fbba46d refs/files: remove unused "errno == EISDIR" code
When we lock a reference like "foo" we need to handle the case where
"foo" exists, but is an empty directory. That's what this code added
in bc7127ef0f (ref locking: allow 'foo' when 'foo/bar' used to exist
but not anymore., 2006-09-30) seems like it should be dealing with.

Except it doesn't, and we never take this branch. The reason is that
when bc7127ef0f was written this looked like:

	ref = resolve_ref([...]);
	if (!ref && errno == EISDIR) {
	[...]

And in resolve_ref() we had this code:

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return NULL;

I.e. we would attempt to read "foo" with open(), which would fail with
EISDIR and we'd return NULL. We'd then take this branch, call
remove_empty_directories() and continue.

Since a1c1d8170d (refs_resolve_ref_unsafe: handle d/f conflicts for
writes, 2017-10-06) we don't. E.g. in the case of
files_copy_or_rename_ref() our callstack will look something like:

	[...] ->
	files_copy_or_rename_ref() ->
	lock_ref_oid_basic() ->
	refs_resolve_ref_unsafe()

At that point the first (now only) refs_resolve_ref_unsafe() call in
lock_ref_oid_basic() would do the equivalent of this in the resulting
call to refs_read_raw_ref() in refs_resolve_ref_unsafe():

	/* Via refs_read_raw_ref() */
	fd = open(path, O_RDONLY);
	if (fd < 0)
		/* get errno == EISDIR */
	/* later, in refs_resolve_ref_unsafe() */
	if ([...] && errno != EISDIR)
		return NULL;
	[...]
	/* returns the refs/heads/foo to the caller, even though it's a directory */
	return refname;

I.e. even though we got an "errno == EISDIR" we won't take this
branch, since in cases of EISDIR "resolved" is always
non-NULL. I.e. we pretend at this point as though everything's OK and
there is no "foo" directory.

We then proceed with the entire ref update and don't call
remove_empty_directories() until we call commit_ref_update(). See
5387c0d883 (commit_ref(): if there is an empty dir in the way, delete
it, 2016-05-05) for the addition of that code, and
a1c1d8170d (refs_resolve_ref_unsafe: handle d/f conflicts for writes,
2017-10-06) for the commit that changed the original codepath added in
bc7127ef0f to use this "EISDIR" handling.

Further historical commentary:

Before the two preceding commits the caller in files_reflog_expire()
was the only one out of our 4 callers that would pass non-NULL as an
oid. We would then set a (now gone) "resolve_flags" to
"RESOLVE_REF_READING" and just before that "errno != EISDIR" check do:

	if (resolve_flags & RESOLVE_REF_READING)
		return NULL;

There may have been some case where this ended up mattering and we
couldn't safely make this change before we removed the "oid"
parameter, but I don't think there was, see [1] for some discussion on
that.

In any case, now that we've removed the "oid" parameter in a preceding
commit we can be sure that this code is redundant, so let's remove it.

1. http://lore.kernel.org/git/871r801yp6.fsf@evledraar.gmail.com

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-25 13:27:37 -07:00
ff7a2e4dbb refs/files: remove unused "oid" in lock_ref_oid_basic()
In the preceding commit the last caller that passed a non-NULL OID was
changed to pass NULL to lock_ref_oid_basic(). As noted in preceding
commits use of this API has been going away (we should use ref
transactions, or lock_raw_ref()), so we're unlikely to gain new
callers that want to pass the "oid".

So let's remove it, doing so means we can remove the "mustexist"
condition, and therefore anything except the "flags =
RESOLVE_REF_NO_RECURSE" case.

Furthermore, since the verify_lock() function we called did most of
its work when the "oid" was passed (as "old_oid") we can inline the
trivial part of it that remains in its only remaining caller. Without
a NULL "oid" passed it was equivalent to calling refs_read_ref_full()
followed by oidclr().

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-25 13:27:37 -07:00
cc40b5ce13 refs API: remove OID argument to reflog_expire()
Since the the preceding commit the "oid" parameter to reflog_expire()
is always NULL, but it was not cleaned up to reduce the size of the
diff. Let's do that subsequent API and documentation cleanup now.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-25 13:27:37 -07:00
ae35e16cd4 reflog expire: don't lock reflogs using previously seen OID
During reflog expiry, the cmd_reflog_expire() function first iterates
over all reflogs in logs/*, and then one-by-one acquires the lock for
each one and expires it. This behavior has been with us since this
command was implemented in 4264dc15e1 ("git reflog expire",
2006-12-19).

Change this to stop calling lock_ref_oid_basic() with the OID we saw
when we looped over the logs, instead have it pass the OID it managed
to lock.

This mostly mitigates a race condition where e.g. "git gc" will fail
in a concurrently updated repository because the branch moved since
"git reflog expire --all" was started. I.e. with:

    error: cannot lock ref '<refname>': ref '<refname>' is at <OID-A> but expected <OID-B>

This behavior of passing in an "oid" was needed for an edge-case that
I've untangled in this and preceding commits though, namely that we
needed this OID because we'd:

 1. Lookup the reflog name/OID via dwim_log()
 2. With that OID, lock the reflog
 3. Later in builtin/reflog.c we use the OID we looked as input to
    lookup_commit_reference_gently(), assured that it's equal to the
    OID we got from dwim_log().

We can be sure that this change is safe to make because between
dwim_log (step #1) and lock_ref_oid_basic (step #2) there was no other
logic relevant to the OID or expiry run in the cmd_reflog_expire()
caller.

We can thus treat that code as a black box, before and after this
change it would get an OID that's been locked, the only difference is
that now we mostly won't be failing to get the lock due to the TOCTOU
race[0]. That failure was purely an implementation detail in how the
"current OID" was looked up, it was divorced from the locking
mechanism.

What do we mean with "mostly"? It mostly mitigates it because we'll
still run into cases where the ref is locked and being updated as we
want to expire it, and other git processes wanting to update the refs
will in turn race with us as we expire the reflog.

That remaining race can in turn be mitigated with the
core.filesRefLockTimeout setting, see 4ff0f01cb7 ("refs: retry
acquiring reference locks for 100ms", 2017-08-21). In practice if that
value is high enough we'll probably never have ref updates or reflog
expiry failing, since the clients involved will retry for far longer
than the time any of those operations could take.

See [1] for an initial report of how this impacted "git gc" and a
large discussion about this change in early 2019. In particular patch
looked good to Michael Haggerty, see his[2]. That message seems to not
have made it to the ML archive, its content is quoted in full in my
[3].

I'm leaving behind now-unused code the refs API etc. that takes the
now-NULL "unused_oid" argument, and other code that can be simplified now
that we never have on OID in that context, that'll be cleaned up in
subsequent commits, but for now let's narrowly focus on fixing the
"git gc" issue. As the modified assert() shows we always pass a NULL
oid to reflog_expire() now.

Unfortunately this sort of probabilistic contention is hard to turn
into a test. I've tested this by running the following three subshells
in concurrent terminals:

    (
        rm -rf /tmp/git &&
        git init /tmp/git &&
        while true
        do
            head -c 10 /dev/urandom | hexdump >/tmp/git/out &&
            git -C /tmp/git add out &&
            git -C /tmp/git commit -m"out"
        done
    )

    (
	rm -rf /tmp/git-clone &&
        git clone file:///tmp/git /tmp/git-clone &&
        while git -C /tmp/git-clone pull
        do
            date
        done
    )

    (
        while git -C /tmp/git-clone reflog expire --all
        do
            date
        done
    )

Before this change the "reflog expire" would fail really quickly with
the "but expected" error noted above.

After this change both the "pull" and "reflog expire" will run for a
while, but eventually fail because I get unlucky with
core.filesRefLockTimeout (the "reflog expire" is in a really tight
loop). As noted above that can in turn be mitigated with higher values
of core.filesRefLockTimeout than the 100ms default.

As noted in the commentary added in the preceding commit there's also
the case of branches being racily deleted, that can be tested by
adding this to the above:

    (
        while git -C /tmp/git-clone branch topic master &&
	      git -C /tmp/git-clone branch -D topic
        do
            date
        done
    )

With core.filesRefLockTimeout set to 10 seconds (it can probably be a
lot lower) I managed to run all four of these concurrently for about
an hour, and accumulated ~125k commits, auto-gc's and all, and didn't
have a single failure. The loops visibly stall while waiting for the
lock, but that's expected and desired behavior.

0. https://en.wikipedia.org/wiki/Time-of-check_to_time-of-use
1. https://lore.kernel.org/git/87tvg7brlm.fsf@evledraar.gmail.com/
2. http://lore.kernel.org/git/b870a17d-2103-41b8-3cbc-7389d5fff33a@alum.mit.edu
3. https://lore.kernel.org/git/87pnqkco8v.fsf@evledraar.gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-25 13:27:37 -07:00
7aa7829f75 refs/files: add a comment about refs_reflog_exists() call
Add a comment about why it is that we need to check for the the
existence of a reflog we're deleting after we've successfully acquired
the lock in files_reflog_expire(). As noted in [1] the lock protocol
for reflogs is somewhat intuitive.

This early exit code the comment applies to dates all the way back to
4264dc15e1 (git reflog expire, 2006-12-19).

1. https://lore.kernel.org/git/54DCDA42.2060800@alum.mit.edu/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-25 13:27:37 -07:00
6f45ec88d2 refs: make repo_dwim_log() accept a NULL oid
Change the repo_dwim_log() function initially added as dwim_log() in
eb3a48221f (log --reflog: use dwim_log, 2007-02-09) to accept a NULL
oid parameter. The refs_resolve_ref_unsafe() function it invokes
already deals with it, but it didn't.

This allows for a bit more clarity in a reflog-walk.c codepath added
in f2eba66d4d (Enable HEAD@{...} and make it independent from the
current branch, 2007-02-03). We'll shortly use this in
builtin/reflog.c as well.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-25 13:27:37 -07:00
81bc122589 refs/debug: re-indent argument list for "prepare"
Re-indent this argument list that's been mis-indented since it was
added in 34c319970d (refs/debug: trace into reflog expiry too,
2021-04-23). This makes a subsequent change smaller.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-25 13:27:37 -07:00
640d9d55c3 refs/files: remove unused "skip" in lock_raw_ref() too
Remove the unused "skip" parameter to lock_raw_ref(), it was never
used. We do use it when passing "skip" to the
refs_rename_ref_available() function in files_copy_or_rename_ref(),
but not here.

This is part of a larger series that modifies lock_ref_oid_basic()
extensively, there will be no more modifications of this function in
this series, but since the preceding commit removed this unused
parameter from lock_ref_oid_basic(), let's do it here too for
consistency.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-25 13:27:37 -07:00
11e984da07 refs/files: remove unused "extras/skip" in lock_ref_oid_basic()
The lock_ref_oid_basic() function has gradually been replaced by use
of the file transaction API, there are only 4 remaining callers of
it.

None of those callers pass non-NULL "extras" and "skip" parameters,
the last such caller went away in 92b1551b1d (refs: resolve symbolic
refs first, 2016-04-25), so let's remove the parameters.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-25 13:27:37 -07:00
491ad946b2 refs: drop unused "flags" parameter to lock_ref_oid_basic()
In the last commit we removed the REF_DELETING flag from
lock_ref_oid_basic(). Since then all of the remaining callers do pass
REF_NO_DEREF, but that has been ignored completely since
7a418f3a17 (lock_ref_sha1_basic(): only handle REF_NODEREF mode,
2016-04-22).

So we can simply get rid of the parameter entirely.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-25 13:27:37 -07:00
ab628588f8 advice: move advice.graftFileDeprecated squashing to commit.[ch]
Move the squashing of the advice.graftFileDeprecated advice over to an
external variable in commit.[ch], allowing advice() to purely use the
new-style API of invoking advice() with an enum.

See 8821e90a09 (advice: don't pointlessly suggest
--convert-graft-file, 2018-11-27) for why quieting this advice was
needed. It's more straightforward to move this code to commit.[ch] and
use it builtin/replace.c, than to go through the indirection of
advice.[ch].

Because this was the last advice_config variable we can remove that
old facility from advice.c.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-25 12:07:52 -07:00
c2a4b6d4ee advice: remove use of global advice_add_embedded_repo
The external use of this variable was added in 532139940c (add: warn
when adding an embedded repository, 2017-06-14). For the use-case it's
more straightforward to track whether we've shown advice in
check_embedded_repo() than setting the global variable.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-25 12:07:52 -07:00
ed9bff0817 advice: remove read uses of most global advice_ variables
In c4a09cc9cc (Merge branch 'hw/advise-ng', 2020-03-25), a new API for
accessing advice variables was introduced and deprecated `advice_config`
in favor of a new array, `advice_setting`.

This patch ports all but two uses which read the status of the global
`advice_` variables over to the new `advice_enabled` API. We'll deal
with advice_add_embedded_repo and advice_graft_file_deprecated
separately.

Signed-off-by: Ben Boeckel <mathstuf@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-25 12:07:52 -07:00
69290551b9 advice: add enum variants for missing advice variables
In daef1b300b (Merge branch 'hw/advice-add-nothing', 2020-02-14), two
advice settings were introduced into the `advice_config` array.

Subsequently, c4a09cc9cc (Merge branch 'hw/advise-ng', 2020-03-25)
started to deprecate `advice_config` in favor of a new array,
`advice_setting`.

However, the latter branch did not include the former branch, and
therefore `advice_setting` is missing the two entries added by the
`hw/advice-add-nothing` branch.

These are currently the only entries in `advice_config` missing from
`advice_setting`.

Signed-off-by: Ben Boeckel <mathstuf@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-25 12:07:51 -07:00
8174627b3d diff-lib: ignore paths that are outside $cwd if --relative asked
For diff family commands, we can tell them to exclude changes outside
of some directories if --relative is requested.

In diff_unmerge(), NULL will be returned if the requested path is
outside of the interesting directories, thus we'll run into NULL
pointer dereference in run_diff_files when trying to dereference
its return value.

Checking for return value of diff_unmerge before dereferencing
is not sufficient, though. Since, diff engine will try to work on such
pathspec later.

Let's not run diff on those unintesting entries, instead.
As a side effect, by skipping like that, we can save some CPU cycles.

Reported-by: Thomas De Zeeuw <thomas@slight.dev>
Tested-by: Carlo Arenas <carenas@gmail.com>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-25 11:49:36 -07:00
1549577338 t6300: check for cat-file exit status code
In test_atom(), we're piping the output of cat-file to tail(1),
thus, losing its exit status.

Let's use a temporary file to preserve git exit status code.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-25 11:45:54 -07:00
597fa8cb43 t6300: don't run cat-file on non-existent object
In t6300, some tests are guarded behind some prerequisites.
Thus, objects created by those tests ain't available if those
prerequisites are unsatistified.  Attempting to run "cat-file"
on those objects will run into failure.

In fact, running t6300 in an environment without gpg(1),
we'll see those warnings:

	fatal: Not a valid object name refs/tags/signed-empty
	fatal: Not a valid object name refs/tags/signed-short
	fatal: Not a valid object name refs/tags/signed-long

Let's put those commands into the real tests, in order to:

* skip their execution if prerequisites aren't satistified.
* check their exit status code

The expected value for objects with type: commit needs to be
computed outside the test because we can't rely on "$3" there.
Furthermore, to prevent the accidental usage of that computed
expected value, BUG out on unknown object's type.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-25 11:45:53 -07:00
5146c2f148 credential: fix leak in credential_apply_config()
Signed-off-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-25 11:41:30 -07:00
c21b2511c2 t5323: drop mentions of "master"
Commit 0696232390 (pack-redundant: fix crash when one packfile in repo,
2020-12-16) added one some new tests to t5323. At the time, the sub-repo
we used was called "master". But in a parallel branch, this was switched
to "main".

When the latter branch was merged in 27d7c8599b (Merge branch
'js/default-branch-name-tests-final-stretch', 2021-01-25), some of those
spots caused textual conflicts, but some (for tests that were far enough
away from other changed code) were just semantic. The merge resolution
fixed up most spots, but missed this one.

Even though this did impact actual code, it turned out not to fail the
tests. Running 'cd "$master_repo"' ended up staying in the same
directory, running the test in the main trash repo instead of the
sub-repo. But because the point of the test is checking behavior when
there are no packfiles, it worked in either repo (since both are empty
at this point in the script).

Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-25 09:08:18 -07:00
c4203212e3 The first batch post 2.33
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-24 15:33:23 -07:00
1b2be06e04 Merge branch 'ps/fetch-pack-load-refs-optim'
Loading of ref tips to prepare for common ancestry negotiation in
"git fetch-pack" has been optimized by taking advantage of the
commit graph when available.

* ps/fetch-pack-load-refs-optim:
  fetch-pack: speed up loading of refs via commit graph
2021-08-24 15:32:41 -07:00
066f6cd447 Merge branch 'jt/push-negotiation-fixes'
Bugfix for common ancestor negotiation recently introduced in "git
push" code path.

* jt/push-negotiation-fixes:
  fetch: die on invalid --negotiation-tip hash
  send-pack: fix push nego. when remote has refs
  send-pack: fix push.negotiate with remote helper
2021-08-24 15:32:40 -07:00
6f64eeab60 Merge branch 'es/trace2-log-parent-process-name'
trace2 logs learned to show parent process name to see in what
context Git was invoked.

* es/trace2-log-parent-process-name:
  tr2: log parent process name
  tr2: make process info collection platform-generic
2021-08-24 15:32:40 -07:00
276bc6357e Merge branch 'hn/refs-test-cleanup'
A handful of tests that assumed implementation details of files
backend for refs have been cleaned up.

* hn/refs-test-cleanup:
  t6001: avoid direct file system access
  t6500: use "ls -1" to snapshot ref database state
  t7064: use update-ref -d to remove upstream branch
  t1410: mark test as REFFILES
  t1405: mark test for 'git pack-refs' as REFFILES
  t1405: use 'git reflog exists' to check reflog existence
  t2402: use ref-store test helper to create broken symlink
  t3320: use git-symbolic-ref rather than filesystem access
  t6120: use git-update-ref rather than filesystem access
  t1503: mark symlink test as REFFILES
  t6050: use git-update-ref rather than filesystem access
2021-08-24 15:32:39 -07:00
08ac213965 Merge branch 'en/ort-perf-batch-15'
Final batch for "merge -sort" optimization.

* en/ort-perf-batch-15:
  merge-ort: remove compile-time ability to turn off usage of memory pools
  merge-ort: reuse path strings in pool_alloc_filespec
  merge-ort: store filepairs and filespecs in our mem_pool
  diffcore-rename, merge-ort: add wrapper functions for filepair alloc/dealloc
  merge-ort: switch our strmaps over to using memory pools
  merge-ort: set up a memory pool
  merge-ort: add pool_alloc, pool_calloc, and pool_strndup wrappers
  diffcore-rename: use a mem_pool for exact rename detection's hashmap
  merge-ort: rename str{map,intmap,set}_func()
2021-08-24 15:32:39 -07:00
aab0eeaba5 Merge branch 'js/expand-runtime-prefix'
Pathname expansion (like "~username/") learned a way to specify a
location relative to Git installation (e.g. its $sharedir which is
$(prefix)/share), with "%(prefix)".

* js/expand-runtime-prefix:
  expand_user_path: allow in-flight topics to keep using the old name
  interpolate_path(): allow specifying paths relative to the runtime prefix
  Use a better name for the function interpolating paths
  expand_user_path(): clarify the role of the `real_home` parameter
  expand_user_path(): remove stale part of the comment
  tests: exercise the RUNTIME_PREFIX feature
2021-08-24 15:32:38 -07:00
f19b2752e7 Merge branch 'ab/bundle-doc'
Doc update.

* ab/bundle-doc:
  bundle doc: replace "basis" with "prerequsite(s)"
  bundle doc: elaborate on rev<->ref restriction
  bundle doc: elaborate on object prerequisites
  bundle doc: rewrite the "DESCRIPTION" section
2021-08-24 15:32:38 -07:00
bda891e664 Merge branch 'zh/ref-filter-raw-data'
Prepare the "ref-filter" machinery that drives the "--format"
option of "git for-each-ref" and its friends to be used in "git
cat-file --batch".

* zh/ref-filter-raw-data:
  ref-filter: add %(rest) atom
  ref-filter: use non-const ref_format in *_atom_parser()
  ref-filter: --format=%(raw) support --perl
  ref-filter: add %(raw) atom
  ref-filter: add obj-type check in grab contents
2021-08-24 15:32:37 -07:00
5c933f0155 Merge branch 'ab/pack-stdin-packs-fix'
Input validation of "git pack-objects --stdin-packs" has been
corrected.

* ab/pack-stdin-packs-fix:
  pack-objects: fix segfault in --stdin-packs option
  pack-objects tests: cover blindspots in stdin handling
2021-08-24 15:32:36 -07:00
e48a623dea Merge branch 'ab/http-drop-old-curl'
Support for ancient versions of cURL library (pre 7.19.4) has been
dropped.

* ab/http-drop-old-curl:
  http: rename CURLOPT_FILE to CURLOPT_WRITEDATA
  http: drop support for curl < 7.19.3 and < 7.17.0 (again)
  http: drop support for curl < 7.19.4
  http: drop support for curl < 7.16.0
  http: drop support for curl < 7.11.1
2021-08-24 15:32:36 -07:00
2f71366878 Merge branch 'ds/add-with-sparse-index'
"git add" can work better with the sparse index.

* ds/add-with-sparse-index:
  add: remove ensure_full_index() with --renormalize
  add: ignore outside the sparse-checkout in refresh()
  pathspec: stop calling ensure_full_index
  add: allow operating on a sparse-only index
  t1092: test merge conflicts outside cone
2021-08-24 15:32:35 -07:00
ae2d05d0c6 Merge branch 'jc/bisect-sans-show-branch'
"git bisect" spawned "git show-branch" only to pretty-print the
title of the commit after checking out the next version to be
tested; this has been rewritten in C.

* jc/bisect-sans-show-branch:
  bisect: simplify return code from bisect_checkout()
  bisect: do not run show-branch just to show the current commit
2021-08-24 15:32:35 -07:00
0160f7e725 rebase: emit one "fatal" in "fatal: fatal: <error>"
The die() routine adds a "fatal: " prefix, there is no reason to add
another one. Fixes code added in e65123a71d (builtin rebase: support
`git rebase <upstream> <switch-to>`, 2018-09-04).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-24 14:48:16 -07:00
f58c7468cd ls-remote: set packet_trace_identity(<name>)
Set packet_trace_identity() for ls-remote. This replaces the generic
"git" identity in GIT_TRACE_PACKET=<file> traces to "ls-remote", e.g.:

    [...] packet:  upload-pack> version 2
    [...] packet:  upload-pack> agent=git/2.32.0-dev
    [...] packet:    ls-remote< version 2
    [...] packet:    ls-remote< agent=git/2.32.0-dev

Where in an "git ls-remote file://<path>" dialog ">" is the sender (or
"to the server") and "<" is the recipient (or "received by the
client").

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-24 14:47:07 -07:00
95b4ff3931 compat: let git_mmap use malloc(3) directly
xmalloc() dies on error, allows zero-sized allocations and enforces
GIT_ALLOC_LIMIT for testing.  Our mmap replacement doesn't need any of
that.  Let's cut out the wrapper, reject zero-sized requests as required
by POSIX and use malloc(3) directly.  Allocation errors were needlessly
handled by git_mmap() before; this code becomes reachable now.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-24 14:43:45 -07:00
a16eb6b1ff maintenance: skip bootout/bootstrap when plist is registered
On macOS, we use launchctl to manage the background maintenance
schedule. This uses a set of .plist files to describe the schedule, but
these files are also registered with 'launchctl bootstrap'. If multiple
'git maintenance start' commands run concurrently, then they can collide
replacing these schedule files and registering them with launchctl.

To avoid extra launchctl commands, do a check for the .plist files on
disk and check if they are registered using 'launchctl list <name>'.
This command will return with exit code 0 if it exists, or exit code 113
if it does not.

We can test this behavior using the GIT_TEST_MAINT_SCHEDULER environment
variable.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-24 14:16:58 -07:00
bb01122a82 maintenance: create launchctl configuration using a lock file
When two `git maintenance` processes try to write the `.plist` file, we
need to help them with serializing their efforts.

The 150ms time-out value was determined from thin air.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-24 14:16:57 -07:00
f6a5af0f62 t9001: PATH must not use Windows-style paths
On Windows, $(pwd) returns a drive-letter style path C:/foo, while $PWD
contains a POSIX style /c/foo path. When we want to interpolate the
current directory in the PATH variable, we must not use the C:/foo style,
because the meaning of the colon is ambiguous. Use the POSIX style.

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-24 14:13:45 -07:00
bd72824c60 t5582: remove spurious 'cd "$D"' line
The variable D is never defined in test t5582, more severely the test
fails if D is defined by something outside the test suite, so remove
this spurious line.

Signed-off-by: Mickey Endito <mickey.endito.2323@protonmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-24 14:10:19 -07:00
c51f8f94e5 submodule--helper: run update procedures from C
Add a new submodule--helper subcommand `run-update-procedure` that runs
the update procedure if the SHA1 of the submodule does not match what
the superproject expects.

This is an intermediate change that works towards total conversion of
`submodule update` from shell to C.

Specific error codes are returned so that the shell script calling the
subcommand can take a decision on the control flow, and preserve the
error messages across subsequent recursive calls of `cmd_update`.

This change is more focused on doing a faithful conversion, so for now we
are not too concerned with trying to reduce subprocess spawns.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-24 14:03:58 -07:00
917a54c017 Documentation: describe MIDX-based bitmaps
Update the technical documentation to describe the multi-pack bitmap
format. This patch merely introduces the new format, and describes its
high-level ideas. Git does not yet know how to read nor write these
multi-pack variants, and so the subsequent patches will:

  - Introduce code to interpret multi-pack bitmaps, according to this
    document.

  - Then, introduce code to write multi-pack bitmaps from the 'git
    multi-pack-index write' sub-command.

Finally, the implementation will gain tests in subsequent patches (as
opposed to inline with the patch teaching Git how to write multi-pack
bitmaps) to avoid a cyclic dependency.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-24 13:21:13 -07:00
1d7f7f242c pack-bitmap-write.c: free existing bitmaps
When writing a new bitmap, the bitmap writer code attempts to read the
existing bitmap (if one is present). This is done in order to quickly
permute the bits of any bitmaps for commits which appear in the existing
bitmap, and were also selected for the new bitmap.

But since this code was added in 341fa34887 (pack-bitmap-write: use
existing bitmaps, 2020-12-08), the resources associated with opening an
existing bitmap were never released.

It's fine to ignore this, but it's bad hygiene. It will also cause a
problem for the multi-pack-index builtin, which will be responsible not
only for writing bitmaps, but also for expiring any old multi-pack
bitmaps.

If an existing bitmap was reused here, it will also be expired. That
will cause a problem on platforms which require file resources to be
closed before unlinking them, like Windows. Avoid this by ensuring we
close reused bitmaps with free_bitmap_index() before removing them.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-24 13:21:13 -07:00
3ba3d0621b pack-bitmap-write.c: gracefully fail to write non-closed bitmaps
The set of objects covered by a bitmap must be closed under
reachability, since it must be the case that there is a valid bit
position assigned for every possible reachable object (otherwise the
bitmaps would be incomplete).

Pack bitmaps are never written from 'git repack' unless repacking
all-into-one, and so we never write non-closed bitmaps (except in the
case of partial clones where we aren't guaranteed to have all objects).

But multi-pack bitmaps change this, since it isn't known whether the
set of objects in the MIDX is closed under reachability until walking
them. Plumb through a bit that is set when a reachable object isn't
found.

As soon as a reachable object isn't found in the set of objects to
include in the bitmap, bitmap_writer_build() knows that the set is not
closed, and so it now fails gracefully.

A test is added in t0410 to trigger a bitmap write without full
reachability closure by removing local copies of some reachable objects
from a promisor remote.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-24 13:21:13 -07:00
fa95666a40 pack-bitmap.c: harden 'test_bitmap_walk()' to check type bitmaps
The special `--test-bitmap` mode of `git rev-list` is used to compare
the result of an object traversal with a bitmap to check its integrity.
This mode does not, however, assert that the types of reachable objects
are stored correctly.

Harden this mode by teaching it to also check that each time an object's
bit is marked, the corresponding bit should be set in exactly one of the
type bitmaps (whose type matches the object's true type).

Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-24 13:21:13 -07:00
6643503722 Merge branch 'master' of github.com:nafmo/git-l10n-sv
* 'master' of github.com:nafmo/git-l10n-sv:
  l10n: sv.po: Fix git-po-helper breakage
2021-08-24 21:43:18 +08:00
03d9fd1f6e l10n: sv.po: Fix git-po-helper breakage
Fixes errors introduced in commit efedbb11de when running
git-po-helper.

Signed-off-by: Peter Krefting <peter@softwolves.pp.se>
2021-08-24 13:39:41 +01:00
f172556b89 cherry-pick: use better advice message
"git cherry-pick", upon seeing a conflict, says:

hint: after resolving the conflicts, mark the corrected paths
hint: with 'git add <paths>' or 'git rm <paths>'
hint: and commit the result with 'git commit'

as if running "git commit" to conclude the resolution of
this single step were the end of the story.  This stems from
the fact that the command originally was to pick a single
commit and not a range of commits, and the message was
written back then and has not been adjusted.

When picking a range of commits and the command stops with a
conflict in the middle of the range, however, after
resolving the conflict and (optionally) recording the result
with "git commit", the user has to run "git cherry-pick
--continue" to have the rest of the range dealt with,
"--skip" to drop the current commit, or "--abort" to discard
the series.

Suggest use of "git cherry-pick --continue/--skip/--abort"
so that the message also covers the case where a range of
commits are being picked.

Similarly, this optimization can be applied to git revert,
suggest use of "git revert --continue/--skip/--abort" so
that the message also covers the case where a range of
commits are being reverted.

It is worth mentioning that now we use advice() to print
the content of GIT_CHERRY_PICK_HELP in print_advice(), each
line of output will start with "hint: ".

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-23 13:09:37 -07:00
f2563c9ef3 rebase -r: fix merge -c with a merge strategy
If a rebase is started with a --strategy option other than "ort" or
"recursive" then "merge -c" does not allow the user to reword the
commit message.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-23 09:36:30 -07:00
baf8ec8d3a rebase -r: don't write .git/MERGE_MSG when fast-forwarding
When fast-forwarding we do not create a new commit so .git/MERGE_MSG
is not removed and can end up seeding the message of a commit made
after the rebase has finished. Avoid writing .git/MERGE_MSG when we
are fast-forwarding by writing the file after the fast-forward
checks. Note that there are no changes to the fast-forward code, it is
simply moved.

Note that the way this change is implemented means we no longer write
the author script when fast-forwarding either. I believe this is safe
for the reasons below but it is a departure from what we do when
fast-forwarding a non-merge commit. If we reword the merge then 'git
commit --amend' will keep the authorship of the commit we're rewording
as it ignores GIT_AUTHOR_* unless --reset-author is passed. It will
also export the correct GIT_AUTHOR_* variables to any hooks and we
already test the authorship of the reworded commit. If we are not
rewording then we no longer call spilt_ident() which means we are no
longer checking the commit author header looks sane. However this is
what we already do when fast-forwarding non-merge commits in
skip_unnecessary_picks() so I don't think we're breaking any promises
by not checking the author here.

Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-23 09:36:30 -07:00
0c164ae7a6 rebase -i: add another reword test
None of the existing reword tests check that there are no uncommitted
changes when the editor is opened. Reuse the editor script from the
last commit to fix this omission.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-23 09:36:26 -07:00
d786523932 l10n: bg.po: Updated Bulgarian translation (5230t)
Reduce warnings reported by git-po-helper to 23 which look like false
positives.  Since Bulgarian uses Cyrillic we mark substitutable parts
in capitals rather than quoting with <> which saves two chars per term
(2.5% real estate on a 80 char line) plus it is safer and simpler as
the <> chars have special (sometimes destructive) meaning for the
shells.  It is also easier for the user to parse.  See for example
the message:
  git mailinfo [<options>] <msg> <patch> < mail >info
where both <> characters serve multiple functions.

Signed-off-by: Alexander Shopov <ash@kambanaria.org>
2021-08-23 09:18:31 +02:00
2be6b6f411 rebase -r: make 'merge -c' behave like reword
If the user runs git log while rewording a commit it is confusing if
sometimes we're amending the commit that's being reworded and at other
times we're creating a new commit depending on whether we could
fast-forward or not[1]. For this reason the reword command ensures
that there are no uncommitted changes when rewording. The reword
command also allows the user to edit the todo list while the rebase is
paused. As 'merge -c' also rewords commits make it behave like reword
and add a test.

[1] https://lore.kernel.org/git/xmqqlfvu4be3.fsf@gitster-ct.c.googlers.com/T/#m133009cb91cf0917bcf667300f061178be56680a

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-20 12:54:36 -07:00
212631ed50 refs/files: remove unused REF_DELETING in lock_ref_oid_basic()
The lock_ref_oid_basic() function has gradually been replaced by
most callers no longer performing a low-level "acquire lock,
update and release", and instead using the ref transaction API.
So there are only 4 remaining callers of lock_ref_oid_basic().

None of those callers pass REF_DELETING anymore, the last caller went
away in 92b1551b1d (refs: resolve symbolic refs first,
2016-04-25).

Before that we'd refactored and moved this code in:

 - 8df4e51138 (struct ref_update: move "have_old" into "flags",
   2015-02-17)

 - 7bd9bcf372 (refs: split filesystem-based refs code into a new
   file, 2015-11-09)

 - 165056b2fc (lock_ref_for_update(): new function, 2016-04-24)

We then finally stopped using it in 92b1551b1d (noted above). So let's
remove the handling of this parameter.

By itself this change doesn't benefit us much, but it's the start of
even more removal of unused code in and around this function in
subsequent commits.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-19 19:06:38 -07:00
881aebffcf refs/packet: add missing BUG() invocations to reflog callbacks
In e0cc8ac820 (packed_ref_store: make class into a subclass of
`ref_store`, 2017-06-23) a die() was added to packed_create_reflog(),
but not to any of the other reflog callbacks, let's do that.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-19 19:06:38 -07:00
325b06deda Makefile: remove archives before manipulating them with 'ar'
The rules creating the $(LIB_FILE) and $(XDIFF_LIB) archives used to
be:

  $(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^

until commit 7b76d6bf22 (Makefile: add and use the ".DELETE_ON_ERROR"
flag, 2021-06-29) removed the '$(RM) $@' part, claiming that "we can
rely on the "c" (create) being present in ARFLAGS", and (I presume)
assuming that it means that the named archive is created from scratch.

Unfortunately, that's not what the 'c' flag does, it merely "Suppress
the diagnostic message that is written to standard error by default
when the archive is created" [1].  Consequently, all object files that
are already present in an existing archive and are not replaced will
remain there.  This leads to linker errors in back-to-back builds of
different revisions without a 'make clean' between them if source
files going into these archives are renamed in between:

  # The last commit renaming files that go into 'libgit.a':
  # bc62692757 (hash-lookup: rename from sha1-lookup, 2020-12-31)
  #  sha1-lookup.c => hash-lookup.c | 14 +++++++-------
  #  sha1-lookup.h => hash-lookup.h | 12 ++++++------
  $ git checkout bc62692757^
  HEAD is now at 7a7d992d0d sha1-lookup: rename `sha1_pos()` as `hash_pos()`
  $ make
  [...]
  $ git checkout 7b76d6bf22
  HEAD is now at 7b76d6bf22 Makefile: add and use the ".DELETE_ON_ERROR" flag
  $ make
  [...]
      AR libgit.a
      LINK git
  /usr/bin/ld: libgit.a(hash-lookup.o): in function `bsearch_hash':
  /home/szeder/src/git/hash-lookup.c:105: multiple definition of `bsearch_hash'; libgit.a(sha1-lookup.o):/home/szeder/src/git/sha1-lookup.c:105: first defined here
  collect2: error: ld returned 1 exit status
  make: *** [Makefile:2213: git] Error 1

Restore the original make rules to first remove $(LIB_FILE) and
$(XDIFF_LIB) and then create them from scratch to avoid these build
errors.

[1] Quoting POSIX at:
    https://pubs.opengroup.org/onlinepubs/9699919799/utilities/ar.html

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-19 16:44:32 -07:00
ff7b83f562 completion: tcsh: Fix regression by drop of wrapper functions
The cleanup of old compat wrappers in bash completion caused a
regression on tcsh completion that still uses them.
Let's update the tcsh call site as well for addressing it.

Fixes: 441ecdab37 ("completion: bash: remove old compat wrappers")
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-18 13:58:27 -07:00
be6444d1ca completion: bash: add correct suffix in variables
__gitcomp automatically adds a suffix, but __gitcomp_nl and others
don't, we need to specify a space by default.

Can be tested with:

  git config branch.autoSetupMe<tab>

This fix only works for versions of bash greater than 4.0, before that
"local sfx" creates an empty string, therefore the unset expansion
doesn't work. The same happens in zsh.

Therefore we don't add the test for that for now.

The correct fix for all shells requires semantic changes in __gitcomp,
but that can be done later.

Cc: SZEDER Gábor <szeder.dev@gmail.com>
Tested-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-18 11:17:26 -07:00
f3cc916acc completion: bash: fix for multiple dash commands
Otherwise options of commands like 'for-each-ref' are not completed.

Tested-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-18 11:17:25 -07:00
e9f2118ddf completion: bash: fix for suboptions with value
We need to ignore options that don't start with -- as well.

Depending on the value of COMP_WORDBREAKS the last word could be
duplicated otherwise.

Can be tested with:

  git merge -X diff-algorithm=<tab>

Tested-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-18 11:17:25 -07:00
bf8ae49a8f completion: bash: fix prefix detection in branch.*
Otherwise we are completely ignoring the --cur argument.

The issue can be tested with:

  git clone --config=branch.<tab>

Reviewed-by: SZEDER Gábor <szeder.dev@gmail.com>
Tested-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-18 11:17:25 -07:00
7116fd3556 Merge branch 'new-mail' of github.com:git-l10n-pt-PT/git-po
* 'new-mail' of github.com:git-l10n-pt-PT/git-po:
  l10n: pt_PT: change email
  l10n: pt_PT: update translation table
2021-08-18 09:30:35 +08:00
225bc32a98 Git 2.33
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-16 12:15:44 -07:00
b06a5047ee Merge branch 'rs/oidtree-alignment-fix'
Codepath to access recently added oidtree data structure had
to make unaligned accesses to oids, which has been corrected.

* rs/oidtree-alignment-fix:
  oidtree: avoid unaligned access to crit-bit tree
2021-08-16 12:14:35 -07:00
f7cd3c0832 Merge tag 'l10n-2.33.0-rnd2' of git://github.com/git-l10n/git-po
l10n-2.33.0-rnd2

* tag 'l10n-2.33.0-rnd2' of git://github.com/git-l10n/git-po: (46 commits)
  l10n: sv.po: Update Swedish translation (5230t0f0u)
  l10n: TEAMS: change Simplified Chinese team leader
  l10n: tr: v2.33 (round 2)
  l10n: es: 2.33.0 round 2
  l10n: zh_CN: for git v2.33.0 l10n round 2
  l10n: zh_CN: Revision for git v2.32.0 l10n round 1
  l10n: README: refactor to use GFM syntax
  l10n: update German translation for Git v2.33.0 (rnd2)
  l10n: pt_PT: v2.33.0 round 2
  l10n: pt_PT: git-po-helper update
  l10n: pt_PT: update translation table
  l10n: zh_TW.po: remove the obsolete glossary
  l10n: vi.po(5230t): Updated translation for v2.32.0 round 2
  l10n: fr.po v2.33 rnd 2
  l10n: id: po-id for 2.33.0 round 2
  l10n: zh_TW.po: update for v2.33.0 rnd 2
  l10n: git.pot: v2.33.0 round 2 (11 new, 8 removed)
  l10n: de.po: fix typos
  l10n: update German translation for Git v2.33.0
  l10n: fr.po fix typos in commands and variables
  ...
2021-08-16 09:38:57 -07:00
b94aff54d1 l10n: pt_PT: change email
* change Portuguese team leader email

Signed-off-by: Daniel Santos <hello@brighterdan.com>
2021-08-16 16:13:38 +01:00
5e9a5e5cd4 l10n: pt_PT: update translation table
* update translation table

Signed-off-by: Daniel Santos <hello@brighterdan.com>
2021-08-16 12:58:57 +01:00
efedbb11de l10n: sv.po: Update Swedish translation (5230t0f0u)
Also fixed some typos reported by "git-po-helper".

Signed-off-by: Peter Krefting <peter@softwolves.pp.se>
Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
2021-08-16 06:54:20 +08:00
cfeae5a31e l10n: TEAMS: change Simplified Chinese team leader
Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
2021-08-16 06:39:30 +08:00
8bcda98da5 oidtree: avoid unaligned access to crit-bit tree
The flexible array member "k" of struct cb_node is used to store the key
of the crit-bit tree node.  It offers no alignment guarantees -- in fact
the current struct layout puts it one byte after a 4-byte aligned
address, i.e. guaranteed to be misaligned.

oidtree uses a struct object_id as cb_node key.  Since cf0983213c (hash:
add an algo member to struct object_id, 2021-04-26) it requires 4-byte
alignment.  The mismatch is reported by UndefinedBehaviorSanitizer at
runtime like this:

hash.h:277:2: runtime error: member access within misaligned address 0x00015000802d for type 'struct object_id', which requires 4 byte alignment
0x00015000802d: note: pointer points here
 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00
             ^
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior hash.h:277:2 in

We can fix that by:

1. eliminating the alignment requirement of struct object_id,
2. providing the alignment in struct cb_node, or
3. avoiding the issue by only using memcpy to access "k".

Currently we only store one of two values in "algo" in struct object_id.
We could use a uint8_t for that instead and widen it only once we add
support for our twohundredth algorithm or so.  That would not only avoid
alignment issues, but also reduce the memory requirements for each
instance of struct object_id by ca. 9%.

Supporting keys with alignment requirements might be useful to spread
the use of crit-bit trees.  It can be achieved by using a wider type for
"k" (e.g. uintmax_t), using different types for the members "byte" and
"otherbits" (e.g. uint16_t or uint32_t for each), or by avoiding the use
of flexible arrays like khash.h does.

This patch implements the third option, though, because it has the least
potential for causing side-effects and we're close to the next release.
If one of the other options is implemented later as well to get their
additional benefits we can get rid of the extra copies introduced here.

Reported-by: Andrzej Hunt <andrzej@ahunt.org>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-15 13:13:50 -07:00
3cf9bb36bf ci: use upload-artifacts v1 for dockerized jobs
e9f79acb28 (ci: upgrade to using actions/{up,down}load-artifacts v2,
2021-06-23) changed all calls to that action from v1 to v2, but there
is still an open bug[1] that affects all nodejs actions and prevents
its use in 32-bit linux (as used by the Linux32 container)

move all dockerized jobs to use v1 that was built in C# and therefore
doesn't have this problem, which will otherwise manifest with confusing
messages like:

  /usr/bin/docker exec  0285adacc4536b7cd962079c46f85fa05a71e66d7905b5e4b9b1a0e8b305722a sh -c "cat /etc/*release | grep ^ID"
  OCI runtime exec failed: exec failed: container_linux.go:380: starting container process caused: no such file or directory: unknown

[1] https://github.com/actions/runner/issues/1011

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-15 09:45:38 -07:00
8ef6aad664 commit: restore --edit when combined with --fixup
Recent changes to --fixup, adding amend suboption, caused the
--edit flag to be ignored as use_editor was always set to zero.

Restore edit_flag having higher priority than fixup_message when
deciding the value of use_editor by moving the edit flag condition
later in the method.

Signed-off-by: Joel Klinghed <the_jk@spawned.biz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-15 09:44:08 -07:00
289946016e Merge branch 'next' of github.com:ChrisADR/git-po
* 'next' of github.com:ChrisADR/git-po:
  l10n: es: 2.33.0 round 2
2021-08-15 18:32:20 +08:00
8f333b5f66 l10n: tr: v2.33 (round 2)
Signed-off-by: Emir Sarı <bitigchi@me.com>
2021-08-15 10:17:15 +03:00
92c199fa6d l10n: es: 2.33.0 round 2
Signed-off-by: Christopher Diaz Riveros <christopher.diaz.riv@gmail.com>
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Javier Spagnoletti phansys@gmail.com
Signed-off-by: Cleydyr Albuquerque <cleydyr@gmail.com>
Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Guillermo Ramos <gramosg>
2021-08-14 23:28:05 -05:00
ec3d4607ae l10n: zh_CN: for git v2.33.0 l10n round 2
Translate 48 new messages (5230t0f0u) for git 2.33.0, and also fixed
typos found by "git-po-helper".

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
Signed-off-by: Fangyi Zhou <me@fangyi.io>
2021-08-15 11:16:07 +08:00
523ccf5dba l10n: zh_CN: Revision for git v2.32.0 l10n round 1
Signed-off-by: Fangyi Zhou <me@fangyi.io>
2021-08-15 10:56:49 +08:00
cb92e28384 l10n: README: refactor to use GFM syntax
Format README.md using GFM (GitHub Flavored Markdown) syntax.

- In order to use more than 3 level headings, use ATX style headings
  instead of setext style headings.

- In order to add highlights for code blocks, use fenced code blocks
  instead of indented code blocks.

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
2021-08-15 10:55:16 +08:00
fb2aacea67 Merge branch 'l10n-2.33-rnd2' of github.com:ralfth/git
* 'l10n-2.33-rnd2' of github.com:ralfth/git:
  l10n: update German translation for Git v2.33.0 (rnd2)
2021-08-15 10:26:18 +08:00
813147b681 Merge branch 'pt-PT' of github.com:git-l10n-pt-PT/git-po
* 'pt-PT' of github.com:git-l10n-pt-PT/git-po:
  l10n: pt_PT: v2.33.0 round 2
  l10n: pt_PT: git-po-helper update
  l10n: pt_PT: update translation table
2021-08-15 10:24:24 +08:00
dc66e3c799 help.c: help.autocorrect=prompt waits for user action
If help.autocorrect is set to 'prompt', the user is prompted
before the suggested action is executed.

Based on original patch by David Barr
https://lore.kernel.org/git/1283758030-13345-1-git-send-email-david.barr@cordelta.com/

Signed-off-by: Azeem Bande-Ali <me@azeemba.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-14 11:20:49 -07:00
4e7e75353a l10n: update German translation for Git v2.33.0 (rnd2)
Signed-off-by: Ralf Thielow <ralf.thielow@gmail.com>
2021-08-14 16:35:44 +02:00
fd3e513e88 l10n: pt_PT: v2.33.0 round 2
* translation of new entries

Signed-off-by: Daniel Santos <hello@brighterdan.com>
2021-08-14 15:13:11 +01:00
7d3bc0806b l10n: pt_PT: git-po-helper update
* run git-po-helper update pt_PT.po

Signed-off-by: Daniel Santos <hello@brighterdan.com>
2021-08-14 15:13:11 +01:00
88e7a93391 l10n: pt_PT: update translation table
* updated translation table

Signed-off-by: Daniel Santos <hello@brighterdan.com>
2021-08-14 15:13:04 +01:00
309f8cf9f9 Merge branch 'loc/zh_TW/210814' of github.com:l10n-tw/git-po
* 'loc/zh_TW/210814' of github.com:l10n-tw/git-po:
  l10n: zh_TW.po: remove the obsolete glossary
  l10n: zh_TW.po: update for v2.33.0 rnd 2
2021-08-14 19:30:54 +08:00
508b35770d l10n: zh_TW.po: remove the obsolete glossary
Signed-off-by: Yi-Jyun Pan <pan93412@gmail.com>
2021-08-14 19:07:58 +08:00
37eabbe498 Merge branch 'master' of github.com:vnwildman/git
* 'master' of github.com:vnwildman/git:
  l10n: vi.po(5230t): Updated translation for v2.32.0 round 2
2021-08-14 17:02:54 +08:00
9b272daa50 Merge branch 'po-id' of github.com:bagasme/git-po
* 'po-id' of github.com:bagasme/git-po:
  l10n: id: po-id for 2.33.0 round 2
2021-08-14 17:01:27 +08:00
86e24f5b12 l10n: vi.po(5230t): Updated translation for v2.32.0 round 2
Signed-off-by: Tran Ngoc Quan <vnwildman@gmail.com>
2021-08-14 14:54:44 +07:00
0934645bf8 l10n: fr.po v2.33 rnd 2
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
2021-08-14 08:28:48 +02:00
2640b6a3a6 l10n: id: po-id for 2.33.0 round 2
Update translation for following component:
  * builtin/submodule--helper.c

Translate following new component:
  * builtin/revert.c

Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
2021-08-14 13:17:13 +07:00
81e30fc0c7 l10n: zh_TW.po: update for v2.33.0 rnd 2
Signed-off-by: Yi-Jyun Pan <pan93412@gmail.com>
2021-08-14 14:01:59 +08:00
55095d03a9 Merge branch 'master' of github.com:vnwildman/git
* 'master' of github.com:vnwildman/git:
  l10n: vi.po(5227t): Fixed typo after run git-po-helper
2021-08-14 11:52:34 +08:00
bbc7dccd9d l10n: git.pot: v2.33.0 round 2 (11 new, 8 removed)
Generate po/git.pot from v2.33.0-rc2 for git v2.33.0 l10n round 2.

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
2021-08-14 07:57:34 +08:00
117e2caa42 Merge branch 'master' of github.com:git/git
* 'master' of github.com:git/git: (51 commits)
  Git 2.33-rc2
  object-file: use unsigned arithmetic with bit mask
  Revert 'diff-merges: let "-m" imply "-p"'
  object-store: avoid extra ';' from KHASH_INIT
  oidtree: avoid nested struct oidtree_node
  Git 2.33-rc1
  test: fix for COLUMNS and bash 5
  The eighth batch
  diff: --pickaxe-all typofix
  mingw: align symlinks-related rmdir() behavior with Linux
  t7508: avoid non POSIX BRE
  use fspathhash() everywhere
  t0001: fix broken not-quite getcwd(3) test in bed67874e2
  Documentation: render special characters correctly
  reset: clear_unpack_trees_porcelain to plug leak
  builtin/rebase: fix options.strategy memory lifecycle
  builtin/merge: free found_ref when done
  builtin/mv: free or UNLEAK multiple pointers at end of cmd_mv
  convert: release strbuf to avoid leak
  read-cache: call diff_setup_done to avoid leak
  ...
2021-08-14 07:56:22 +08:00
18c9fced7b Merge branch 'master' of github.com:nafmo/git-l10n-sv
* 'master' of github.com:nafmo/git-l10n-sv:
  l10n: sv.po: Update Swedish translation (5227t0f0u)
2021-08-14 07:55:01 +08:00
738cdce5b2 Merge branch 'master' of github.com:alshopov/git-po
* 'master' of github.com:alshopov/git-po:
  l10n: bg.po: Updated Bulgarian translation (5227t)
2021-08-14 07:54:38 +08:00
37a65ae0c9 Merge branch 'l10n-2.33' of github.com:ralfth/git
* 'l10n-2.33' of github.com:ralfth/git:
  l10n: de.po: fix typos
  l10n: update German translation for Git v2.33.0
2021-08-14 07:50:33 +08:00
1dcaad9726 Merge branch 'fr_fix_typos' of github.com:jnavila/git
* 'fr_fix_typos' of github.com:jnavila/git:
  l10n: fr.po fix typos in commands and variables
2021-08-14 07:45:37 +08:00
14eb1841a6 Merge branch 'master' of github.com:Softcatala/git-po
* 'master' of github.com:Softcatala/git-po:
  l10n: Update Catalan translation
2021-08-14 07:44:41 +08:00
e5ee33e855 rebase --continue: remove .git/MERGE_MSG
If the user skips the final commit by removing all the changes from
the index and worktree with 'git restore' (or read-tree) and then runs
'git rebase --continue' .git/MERGE_MSG is left behind. This will seed
the commit message the next time the user commits which is not what we
want to happen.

Reported-by: Victor Gambier <vgambier@excilys.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-13 11:36:22 -07:00
bed9b4e312 rebase --apply: restore some tests
980b482d28 ("rebase tests: mark tests specific to the am-backend with
--am", 2020-02-15) sought to prepare tests testing the "apply" backend
in preparation for 2ac0d6273f ("rebase: change the default backend
from "am" to "merge"", 2020-02-15). However some tests seem to have
been missed leading to us testing the "merge" backend twice. This
patch fixes some cases that I noticed while adding tests to these
files, I have not audited all the other rebase test files. I've
reworded a couple of the test descriptions to make it clear which
backend they are testing.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-13 11:36:22 -07:00
118ee5c613 t3403: fix commit authorship
Setting GIT_AUTHOR_* when committing with --amend will only change the
author if we also pass --reset-author.  This commit is used in some
tests that ensure the author ident does not change when rebasing.
Creating this commit without changing the authorship meant that the
test would not catch regressions that caused rebase to discard the
original authorship information.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-13 11:36:22 -07:00
a90b519187 l10n: de.po: fix typos
Fix some typos found by `./git-po-helper check-po po/de.po`.

Signed-off-by: Ralf Thielow <ralf.thielow@gmail.com>
2021-08-13 20:22:48 +02:00
d4c5a0c81b l10n: update German translation for Git v2.33.0
Signed-off-by: Ralf Thielow <ralf.thielow@gmail.com>
2021-08-13 20:22:43 +02:00
ad51ae4dc0 ci: update freebsd 12 cirrus job
make sure it uses a supported OS branch and uses all the resources
that can be allocated efficiently.

while only 1GB of memory is needed, 2GB is the minimum for a 2 CPU
machine (the default), but by increasing parallelism wall time has
been reduced by 35%.

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-12 14:00:52 -07:00
b3e36df0d4 list-objects.c: rename "traverse_trees_and_blobs" to "traverse_non_commits"
Function `traverse_trees_and_blobs` not only works on trees and blobs,
but also on tags, the function name is somewhat misleading. This commit
rename it to `traverse_non_commits`.

Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-12 13:08:30 -07:00
95862fa2fa l10n: fr.po fix typos in commands and variables
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
2021-08-12 21:14:15 +02:00
ea75369306 l10n: id: mismatch variable name fixes
Jiang Xin reported possible typos in po/id.po, all of them are mismatch
variable names. Fix them.

Reported-by: Jiang Xin <worldhello.net@gmail.com>
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
2021-08-12 17:18:29 +07:00
2be328e4df l10n: vi.po(5227t): Fixed typo after run git-po-helper
Signed-off-by: Tran Ngoc Quan <vnwildman@gmail.com>
2021-08-12 14:05:39 +07:00
b50be84244 l10n: Update Catalan translation
Signed-off-by: Jordi Mas <jmas@softcatala.org>
2021-08-12 08:07:32 +02:00
b227bead4d t5607: avoid using prerequisites to select algorithm
In this test, we currently use the SHA1 prerequisite to specify the
algorithm we're using to test, since SHA-256 bundles are always v3,
whereas SHA-1 bundles default to v2, and as a result the default output
differs.

However, this causes a problem if we run with GIT_TEST_FAIL_PREREQS set,
since that means that we'll unexpectedly fail the SHA1 prerequisite,
resulting in incorrect expected output.  Let's fix this by checking
against the built-in data called "algo", which tells us which algorithm
is in use.  This should work in any situation, making our test a little
more robust.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-11 22:24:32 -07:00
41a03e285f Merge branch 'daniel' of github.com:git-l10n-pt-PT/git-po
* 'daniel' of github.com:git-l10n-pt-PT/git-po:
  l10n: pt_PT: cleaning flags mismatch
  l10n: pt_PT: cleaning duplicate translations
  l10n: pt_PT: update translation tables
  l10n: pt_PT: translated git v2.33.0
  l10n: pt_PT: update git-po-helper
  l10n: pt_PT: remove trailing comments
  l10n: pt_PT: translation tables
  l10n: pt_PT: add Portuguese translations part 5
  l10n: pt_PT: add Portuguese translations part 4
2021-08-12 08:23:55 +08:00
5d213e46bb Git 2.33-rc2
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-11 12:36:18 -07:00
4c90d8908a Merge branch 'jn/log-m-does-not-imply-p'
Earlier "git log -m" was changed to always produce patch output,
which would break existing scripts, which has been reverted.

* jn/log-m-does-not-imply-p:
  Revert 'diff-merges: let "-m" imply "-p"'
2021-08-11 12:36:18 -07:00
7cfaa86fe6 Merge branch 'cb/many-alternate-optim-fixup'
Build fix.

* cb/many-alternate-optim-fixup:
  object-file: use unsigned arithmetic with bit mask
  object-store: avoid extra ';' from KHASH_INIT
  oidtree: avoid nested struct oidtree_node
2021-08-11 12:36:17 -07:00
cebead1ebf ci: run a pedantic build as part of the GitHub workflow
similar to the recently added sparse task, it is nice to know as early
as possible.

add a dockerized build using fedora (that usually has the latest gcc)
to be ahead of the curve and avoid older ISO C issues at the same time.

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-11 11:25:06 -07:00
a8cbc89589 userdiff: improve java hunk header regex
Currently, the git diff hunk headers show the wrong method signature if the
method has a qualified return type, an array return type, or a generic return
type because the regex doesn't allow dots (.), [], or < and > in the return
type.  Also, type parameter declarations couldn't be matched.

Add several t4018 tests asserting the right hunk headers for different cases:

  - enum constant change
  - change in generic method with bounded type parameters
  - change in generic method with wildcard
  - field change in a nested class

Signed-off-by: Tassilo Horn <tsdh@gnu.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-11 11:11:30 -07:00
b6029b3279 userdiff: comment on the builtin patterns
Remind developers that they do not need to go overboard to implement
patterns to prepare for invalid constructs.  They only have to be
sufficiently permissive, assuming that the payload is syntactically
correct, and that may allow them to be simpler.

Text stolen mostly from, and further improved by, Johannes Sixt.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-11 11:02:59 -07:00
581a3bb155 object-file: use unsigned arithmetic with bit mask
33f379eee6 (make object_directory.loose_objects_subdir_seen a bitmap,
2021-07-07) replaced a wasteful 256-byte array with a 32-byte array
and bit operations.  The mask calculation shifts a literal 1 of type
int left by anything between 0 and 31.  UndefinedBehaviorSanitizer
doesn't like that and reports:

object-file.c:2477:18: runtime error: left shift of 1 by 31 places cannot be represented in type 'int'

Make sure to use an unsigned 1 instead to avoid the issue.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-11 10:19:56 -07:00
e83b1243bd l10n: pt_PT: cleaning flags mismatch
* corrected git flags mismatch

Signed-off-by: Daniel Santos <hello@brighterdan.com>
2021-08-11 05:07:01 +01:00
716f68ec33 Merge branch 'ds/add-with-sparse-index' into ds/sparse-index-ignored-files
* ds/add-with-sparse-index:
  add: remove ensure_full_index() with --renormalize
  add: ignore outside the sparse-checkout in refresh()
  pathspec: stop calling ensure_full_index
  add: allow operating on a sparse-only index
  t1092: test merge conflicts outside cone
2021-08-10 13:39:14 -07:00
626beebdf8 connect, protocol: log negotiated protocol version
It is useful for performance monitoring and debugging purposes to know
the wire protocol used for remote operations. This may differ from the
version set in local configuration due to differences in version and/or
configuration between the server and the client. Therefore, log the
negotiated wire protocol version via trace2, for both clients and
servers.

Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-10 11:46:33 -07:00
de0fcbe0f4 submodule--helper: rename compute_submodule_clone_url()
Let's rename 'compute_submodule_clone_url()' to 'resolve_relative_url()'
to make it clear that this internal helper need not be used exclusively
for computing submodule clone URLs.

Since the original 'resolve-relative-url' subcommand and its C entry
point has been removed in c461095ae3 (submodule--helper: remove
resolve-relative-url subcommand, 2021-07-02), this rename can be done
without causing any confusion about which function it actually binds to.

Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-10 11:45:12 -07:00
15fe88d5a6 submodule--helper: remove resolve-relative-url subcommand
The shell subcommand `resolve-relative-url` is no longer required, as
its last caller has been removed when it was converted to C.

Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-10 11:45:12 -07:00
ba8a3b019e submodule--helper: remove add-config subcommand
Also no longer needed is this subcommand, as all of its functionality is
being called by the newly-introduced `module_add()` directly within C.

Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-10 11:45:12 -07:00
f006132c24 submodule--helper: remove add-clone subcommand
We no longer need this subcommand, as all of its functionality is being
called by the newly-introduced `module_add()` directly within C.

Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-10 11:45:11 -07:00
a6226fd772 submodule--helper: convert the bulk of cmd_add() to C
Introduce the 'add' subcommand to `submodule--helper.c` that does all
the work 'submodule add' past the parsing of flags.

We also remove the constness of the sm_path field of the `add_data`
struct. This is needed so that it can be modified by
normalize_path_copy().

As with the previous conversions, this is meant to be a faithful
conversion with no modification to the behaviour of `submodule add`.

Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Helped-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Based-on-patch-by: Shourya Shukla <periperidip@gmail.com>
Based-on-patch-by: Prathamesh Chavan <pc44800@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-10 11:45:11 -07:00
ed86301f68 dir: libify and export helper functions from clone.c
These functions can be useful to other parts of Git. Let's move them to
dir.c, while renaming them to be make their functionality more explicit.

Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-10 11:45:11 -07:00
0c61041ed6 submodule--helper: remove repeated code in sync_submodule()
This part of `sync_submodule()` is doing the same thing that
`compute_submodule_clone_url()` is doing. Let's reuse that helper here.

Note that this change adds a small overhead where we allocate and free
the 'remote' twice, but that is a small price to pay for the higher
level of abstraction we get.

Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-10 11:45:11 -07:00
ab6f23b751 submodule--helper: refactor resolve_relative_url() helper
Refactor the helper function to resolve a relative url, by reusing the
existing `compute_submodule_clone_url()` function.

`compute_submodule_clone_url()` performs the same work that
`resolve_relative_url()` is doing, so we eliminate this code repetition
by moving the former function's definition up, and calling it inside
`resolve_relative_url()`.

Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-10 11:45:11 -07:00
6baf4e4da4 submodule--helper: add options for compute_submodule_clone_url()
Let's modify the interface to `compute_submodule_clone_url()` function
by adding two more arguments, so that we can reuse this in various parts
of `submodule--helper.c` that follow a common pattern, which is--read
the remote url configuration of the superproject and then call
`relative_url()`.

This function is nearly identical to `resolve_relative_url()`, the only
difference being the extra warning message. We can add a quiet flag to
it, to suppress that warning when not needed, and then refactor
`resolve_relative_url()` by using this function, something we will do in
the next patch.

We also rename the local variable 'relurl' to avoid potential confusion
with the 'rel_url' parameter while we are at it.

Having this functionality factored out will be useful for converting the
rest of `submodule add` in subsequent patches.

Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-10 11:45:11 -07:00
46d723ce57 apply: keep buffer/size pair in sync when parsing binary hunks
We parse through binary hunks by looping through the buffer with code
like:

    llen = linelen(buffer, size);

    ...do something with the line...

    buffer += llen;
    size -= llen;

However, before we enter the loop, there is one call that increments
"buffer" but forgets to decrement "size". As a result, our "size" is off
by the length of that line, and subsequent calls to linelen() may look
past the end of the buffer for a newline.

The fix is easy: we just need to decrement size as we do elsewhere.

This bug goes all the way back to 0660626caf (binary diff: further
updates., 2006-05-05). Presumably nobody noticed because it only
triggers if the patch is corrupted, and even then we are often "saved"
by luck. We use a strbuf to store the incoming patch, so we overallocate
there, plus we add a 16-byte run of NULs as slop for memory comparisons.
So if this happened accidentally, the common case is that we'd just read
a few uninitialized bytes from the end of the strbuf before producing
the expected "this patch is corrupted" error complaint.

However, it is possible to carefully construct a case which reads off
the end of the buffer. The included test does so. It will pass both
before and after this patch when run normally, but using a tool like
ASan shows that we get an out-of-bounds read before this patch, but not
after.

Reported-by: Xingman Chen <xichixingman@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-10 11:38:13 -07:00
c4d5907324 range-diff: use ssize_t for parsed "len" in read_patches()
As we iterate through the buffer containing git-log output, parsing
lines, we use an "int" to store the size of an individual line. This
should be a size_t, as we have no guarantee that there is not a
malicious 2GB+ commit-message line in the output.

Overflowing this integer probably doesn't do anything _too_ terrible. We
are not using the value to size a buffer, so the worst case is probably
an out-of-bounds read from before the array. But it's easy enough to
fix.

Note that we have to use ssize_t here, since we also store the length
result from parse_git_diff_header(), which may return a negative value
for error. That function actually returns an int itself, which has a
similar overflow problem, but I'll leave that for another day. Much
of the apply.c code uses ints and should be converted as a whole; in the
meantime, a negative return from parse_git_diff_header() will be
interpreted as an error, and we'll bail (so we can't handle such a case,
but given that it's likely to be malicious anyway, the important thing
is we don't have any memory errors).

Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-10 11:37:36 -07:00
7c86d365da range-diff: handle unterminated lines in read_patches()
When parsing our buffer of output from git-log, we have a
find_end_of_line() helper that finds the next newline, and gives us the
number of bytes to move past it, or the size of the whole remaining
buffer if there is no newline.

But trying to handle both those cases leads to some oddities:

  - we try to overwrite the newline with NUL in the caller, by writing
    over line[len-1]. This is at best redundant, since the helper will
    already have done so if it saw a newline. But if it didn't see a
    newline, it's actively wrong; we'll overwrite the byte at the end of
    the (unterminated) line.

    We could solve this just dropping the extra NUL assignment in the
    caller and just letting the helper do the right thing. But...

  - if we see a "diff --git" line, we'll restore the newline on top of
    the NUL byte, so we can pass the string to parse_git_diff_header().
    But if there was no newline in the first place, we can't do this.
    There's no place to put it (the current code writes a newline
    over whatever byte we obliterated earlier). The best we can do is
    feed the complete remainder of the buffer to the function (which is,
    in fact, a string, by virtue of being a strbuf).

To solve this, the caller needs to know whether we actually found a
newline or not. We could modify find_end_of_line() to return that
information, but we can further observe that it has only one caller.
So let's just inline it in that caller.

Nobody seems to have noticed this case, probably because git-log would
never produce input that doesn't end with a newline. Arguably we could
just return an error as soon as we see that the output does not end in a
newline. But the code to do so actually ends up _longer_, mostly because
of the cleanup we have to do in handling the error.

Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-10 11:37:36 -07:00
47ac23d314 range-diff: drop useless "offset" variable from read_patches()
The "offset" variable was was introduced in 44b67cb62b (range-diff:
split lines manually, 2019-07-11), but it has never done anything
useful. We use it to count up the number of bytes we've consumed, but we
never look at the result. It was probably copied accidentally from an
almost-identical loop in apply.c:find_header() (and the point of that
commit was to make use of the parse_git_diff_header() function which
underlies both).

Because the variable was set but not used, most compilers didn't seem to
notice, but the upcoming clang-14 does complain about it, via its
-Wunused-but-set-variable warning.

Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-10 11:37:36 -07:00
db0aa64208 l10n: pt_PT: cleaning duplicate translations
* cleaning duplicate incorrect translations part 1

Signed-off-by: Daniel Santos <hello@brighterdan.com>
2021-08-10 19:32:01 +01:00
ffbb3ee955 l10n: pt_PT: update translation tables
* update translation tables

Signed-off-by: Daniel Santos <hello@brighterdan.com>
2021-08-10 19:32:01 +01:00
d0c637e6b9 l10n: pt_PT: translated git v2.33.0
* translated new entries of git v2.33.0

Signed-off-by: Daniel Santos <hello@brighterdan.com>
2021-08-10 19:31:49 +01:00
59dcbb810c Merge branch 'ar/submodule-add-config' into ar/submodule-add
* ar/submodule-add-config:
  submodule--helper: introduce add-config subcommand
2021-08-10 11:01:19 -07:00
a452128a36 submodule--helper: introduce add-config subcommand
Add a new "add-config" subcommand to `git submodule--helper` with the
goal of converting part of the shell code in git-submodule.sh related to
`git submodule add` into C code. This new subcommand sets the
configuration variables of a newly added submodule, by registering the
url in local git config, as well as the submodule name and path in the
.gitmodules file. It also sets 'submodule.<name>.active' to "true" if
the submodule path has not already been covered by any pathspec
specified in 'submodule.active'.

This is meant to be a faithful conversion from shell to C, although we
add comments to areas that could be improved in future patches, after
the conversion has settled.

Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Based-on-patch-by: Shourya Shukla <periperidip@gmail.com>
Based-on-patch-by: Prathamesh Chavan <pc44800@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-10 10:57:57 -07:00
6a38e33331 Revert 'diff-merges: let "-m" imply "-p"'
This reverts commit f5bfcc823b, which
made "git log -m" imply "--patch" by default.  The logic was that
"-m", which makes diff generation for merges perform a diff against
each parent, has no use unless I am viewing the diff, so we could save
the user some typing by turning on display of the resulting diff
automatically.  That wasn't expected to adversely affect scripts
because scripts would either be using a command like "git diff-tree"
that already emits diffs by default or would be combining -m with a
diff generation option such as --name-status.  By saving typing for
interactive use without adversely affecting scripts in the wild, it
would be a pure improvement.

The problem is that although diff generation options are only relevant
for the displayed diff, a script author can imagine them affecting
path limiting.  For example, I might run

	git log -w --format=%H -- README

hoping to list commits that edited README, excluding whitespace-only
changes.  In fact, a whitespace-only change is not TREESAME so the use
of -w here has no effect (since we don't apply these diff generation
flags to the diff_options struct rev_info::pruning used for this
purpose), but the documentation suggests that it should work

	Suppose you specified foo as the <paths>. We shall call
	commits that modify foo !TREESAME, and the rest TREESAME. (In
	a diff filtered for foo, they look different and equal,
	respectively.)

and a script author who has not tested whitespace-only changes
wouldn't notice.

Similarly, a script author could include

	git log -m --first-parent --format=%H -- README

to filter the first-parent history for commits that modified README.
The -m is a no-op but it reflects the script author's intent.  For
example, until 1e20a407fe (stash list: stop passing "-m" to "git
log", 2021-05-21), "git stash list" did this.

As a result, we can't safely change "-m" to imply "-p" without fear of
breaking such scripts.  Restore the previous behavior.

Noticed because Rust's src/bootstrap/bootstrap.py made use of this
same construct: https://github.com/rust-lang/rust/pull/87513.  That
script has been updated to omit the unnecessary "-m" option, but we
can expect other scripts in the wild to have similar expectations.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-09 13:52:01 -07:00
9d66d5eae4 l10n: sv.po: Update Swedish translation (5227t0f0u)
Signed-off-by: Peter Krefting <peter@softwolves.pp.se>
2021-08-09 20:38:56 +01:00
f559d6d45e revision: avoid hitting packfiles when commits are in commit-graph
When queueing references in git-rev-list(1), we try to optimize parsing
of commits via the commit-graph. To do so, we first look up the object's
type, and if it is a commit we call `repo_parse_commit()` instead of
`parse_object()`. This is quite inefficient though given that we're
always uncompressing the object header in order to determine the type.
Instead, we can opportunistically search the commit-graph for the object
ID: in case it's found, we know it's a commit and can directly fill in
the commit object without having to uncompress the object header.

Expose a new function `lookup_commit_in_graph()`, which tries to find a
commit in the commit-graph by ID, and convert `get_reference()` to use
this function. This provides a big performance win in cases where we
load references in a repository with lots of references pointing to
commits. The following has been executed in a real-world repository with
about 2.2 million refs:

    Benchmark #1: HEAD~: rev-list --unsorted-input --objects --quiet --not --all --not $newrev
      Time (mean ± σ):      4.458 s ±  0.044 s    [User: 4.115 s, System: 0.342 s]
      Range (min … max):    4.409 s …  4.534 s    10 runs

    Benchmark #2: HEAD: rev-list --unsorted-input --objects --quiet --not --all --not $newrev
      Time (mean ± σ):      3.089 s ±  0.015 s    [User: 2.768 s, System: 0.321 s]
      Range (min … max):    3.061 s …  3.105 s    10 runs

    Summary
      'HEAD: rev-list --unsorted-input --objects --quiet --not --all --not $newrev' ran
        1.44 ± 0.02 times faster than 'HEAD~: rev-list --unsorted-input --objects --quiet --not --all --not $newrev'

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-09 09:51:12 -07:00
809ea28f80 commit-graph: split out function to search commit position
The function `find_commit_in_graph()` assumes that the caller has passed
an object which was already determined to be a commit given that it will
access the commit's graph position, which is stored in a commit slab. In
a subsequent patch, we want to search for an object ID though without
knowing whether it is a commit or not, which is not currently possible.

Split out the logic to search the commit graph for a given object ID to
prepare for this change. This commit also renames the function to
`find_commit_pos_in_graph()`, which more accurately reflects what this
function does. Furthermore, in order to allow for the searched object ID
to be const, we need to adjust `bsearch_graph()`'s signature to accept a
constant object ID as input, too.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-09 09:51:12 -07:00
bf9c0cbddb revision: stop retrieving reference twice
When queueing up references for the revision walk, `handle_one_ref()`
will resolve the reference's object ID via `get_reference()` and then
queue the ID as pending object via `add_pending_oid()`. But given that
`add_pending_oid()` is only a thin wrapper around `add_pending_object()`
which fist calls `get_reference()`, we effectively resolve the reference
twice and thus duplicate some of the work.

Fix the issue by instead calling `add_pending_object()` directly, which
takes the already-resolved object as input. In a repository with lots of
refs, this translates into a near 10% speedup:

    Benchmark #1: HEAD~: rev-list --unsorted-input --objects --quiet --not --all --not $newrev
      Time (mean ± σ):      5.015 s ±  0.038 s    [User: 4.698 s, System: 0.316 s]
      Range (min … max):    4.970 s …  5.089 s    10 runs

    Benchmark #2: HEAD: rev-list --unsorted-input --objects --quiet --not --all --not $newrev
      Time (mean ± σ):      4.606 s ±  0.029 s    [User: 4.260 s, System: 0.345 s]
      Range (min … max):    4.565 s …  4.657 s    10 runs

    Summary
      'HEAD: rev-list --unsorted-input --objects --quiet --not --all --not $newrev' ran
        1.09 ± 0.01 times faster than 'HEAD~: rev-list --unsorted-input --objects --quiet --not --all --not $newrev'

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-09 09:51:12 -07:00
f45022dc2f connected: do not sort input revisions
In order to compute whether objects reachable from a set of tips are all
connected, we do a revision walk with these tips as positive references
and `--not --all`. `--not --all` will cause the revision walk to load
all preexisting references as uninteresting, which can be very expensive
in repositories with many references.

Benchmarking the git-rev-list(1) command highlights that by far the most
expensive single phase is initial sorting of the input revisions: after
all references have been loaded, we first sort commits by author date.
In a real-world repository with about 2.2 million references, it makes
up about 40% of the total runtime of git-rev-list(1).

Ultimately, the connectivity check shouldn't really bother about the
order of input revisions at all. We only care whether we can actually
walk all objects until we hit the cut-off point. So sorting the input is
a complete waste of time.

Introduce a new "--unsorted-input" flag to git-rev-list(1) which will
cause it to not sort the commits and adjust the connectivity check to
always pass the flag. This results in the following speedups, executed
in a clone of gitlab-org/gitlab [1]:

    Benchmark #1: git rev-list  --objects --quiet --not --all --not $(cat newrev)
      Time (mean ± σ):      7.639 s ±  0.065 s    [User: 7.304 s, System: 0.335 s]
      Range (min … max):    7.543 s …  7.742 s    10 runs

    Benchmark #2: git rev-list --unsorted-input --objects --quiet --not --all --not $newrev
      Time (mean ± σ):      4.995 s ±  0.044 s    [User: 4.657 s, System: 0.337 s]
      Range (min … max):    4.909 s …  5.048 s    10 runs

    Summary
      'git rev-list --unsorted-input --objects --quiet --not --all --not $(cat newrev)' ran
        1.53 ± 0.02 times faster than 'git rev-list  --objects --quiet --not --all --not $newrev'

[1]: https://gitlab.com/gitlab-org/gitlab.git. Note that not all refs
     are visible to clients.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-09 09:51:12 -07:00
00e302da76 builtin/merge: avoid -Wformat-extra-args from ancient Xcode
d540b70c85 (merge: cleanup messages like commit, 2019-04-17) adds
a way to change part of the helper text using a single call to
strbuf_add_commented_addf but with two formats with varying number
of parameters.

this trigger a warning in old versions of Xcode (ex 8.0), so use
instead two independent calls with a matching number of parameters

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-09 09:48:01 -07:00
dd3c8a72a2 object-store: avoid extra ';' from KHASH_INIT
cf2dc1c238 (speed up alt_odb_usable() with many alternates, 2021-07-07)
introduces a KHASH_INIT invocation with a trailing ';', which while
commonly expected will trigger warnings with pedantic on both
clang[-Wextra-semi] and gcc[-Wpedantic], because that macro has already
a semicolon and is meant to be invoked without one.

while fixing the macro would be a worthy solution (specially considering
this is a common recurring problem), remove the extra ';' for now to
minimize churn.

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-09 09:01:30 -07:00
14825944d7 oidtree: avoid nested struct oidtree_node
92d8ed8ac1 (oidtree: a crit-bit tree for odb_loose_cache, 2021-07-07)
adds a struct oidtree_node that contains only an n field with a
struct cb_node.

unfortunately, while building in pedantic mode witch clang 12 (as well
as a similar error from gcc 11) it will show:

  oidtree.c:11:17: error: 'n' may not be nested in a struct due to flexible array member [-Werror,-Wflexible-array-extensions]
          struct cb_node n;
                         ^

because of a constrain coded in ISO C 11 6.7.2.1¶3 that forbids using
structs that contain a flexible array as part of another struct.

use a strict cb_node directly instead.

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-09 09:01:30 -07:00
e84f865118 l10n: vi.po(5227t): Updated Vietnamese translation for v2.32.0
Signed-off-by: Tran Ngoc Quan <vnwildman@gmail.com>
2021-08-09 07:58:57 +07:00
f32c5d3716 build: catch clang that identifies itself as "$VENDOR clang"
The case statement in detect-compiler notices 'clang', 'FreeBSD
clang' and 'Apple clang', but there are other platforms that follow
the '$VENDOR clang' pattern (e.g. Debian).

Generalize the pattern to catch them.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-06 13:35:37 -07:00
33f13ad7c5 build: clang version may not be followed by extra words
The get_family and get_version helpers of detect-compiler assume
that the line to identify the version from the compilers have a
token "version", followed by the version number, followed by some
other string, e.g.

  $ CC=gcc get_version_line
  gcc version 10.2.1 20210110 (Debian 10.2.1-6)

But that is not necessarily true, e.g.

  $ CC=clang get_version_line
  Debian clang version 11.0.1-2

Tweak the script not to require extra string after the version.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-06 13:30:24 -07:00
f6bb2099bf build: update detect-compiler for newer Xcode version
1da1580e4c (Makefile: detect compiler and enable more warnings in
DEVELOPER=1, 2018-04-14) uses the output of the compiler banner to
detect the compiler family.

Apple had since changed the wording used to refer to its compiler
as clang instead of LLVM as shown by:

  $ cc --version
  Apple clang version 12.0.5 (clang-1205.0.22.9)
  Target: x86_64-apple-darwin20.6.0
  Thread model: posix
  InstalledDir: /Library/Developer/CommandLineTools/usr/bin

so update the script to match, and allow DEVELOPER=1 to work as
expected again in macOS.

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-06 13:29:00 -07:00
2d755dfac9 Git 2.33-rc1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-06 12:53:06 -07:00
aa7d2fe355 Merge branch 'cb/t7508-regexp-fix'
* cb/t7508-regexp-fix:
  t7508: avoid non POSIX BRE
2021-08-06 12:52:22 -07:00
55194925e6 Merge branch 'ab/pickaxe-pcre2'
* ab/pickaxe-pcre2:
  diff: --pickaxe-all typofix
2021-08-06 12:52:15 -07:00
c87977a0c5 Merge branch 'fc/disable-checkwinsize'
* fc/disable-checkwinsize:
  test: fix for COLUMNS and bash 5
2021-08-06 12:50:26 -07:00
390b44eb2b test: fix for COLUMNS and bash 5
Since c49a177bec (test-lib.sh: set COLUMNS=80 for --verbose
repeatability, 2021-06-29) multiple tests have been failing when using
bash 5 because checkwinsize is enabled by default, therefore COLUMNS is
reset using TIOCGWINSZ even for non-interactive shells.

It's debatable whether or not bash should even be doing that, but for
now we can avoid this undesirable behavior by disabling this option.

Reported-by: Fabian Stelzer <fabian.stelzer@campoint.net>
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
[jc: with SZEDER Gábor's suggestion to do this before setting COLUMNS]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-06 09:59:55 -07:00
12515dc49e l10n: bg.po: Updated Bulgarian translation (5227t)
Signed-off-by: Alexander Shopov <ash@kambanaria.org>
2021-08-06 13:40:56 +03:00
5fee9b925e Merge branch 'po-id' of github.com:bagasme/git-po
* 'po-id' of github.com:bagasme/git-po:
  l10n: id: po-id for 2.33.0 (round 1)
2021-08-06 08:00:45 +08:00
f5a3c5e637 Update docs for change of default merge backend
Make it clear that `ort` is the default merge strategy now rather than
`recursive`, including moving `ort` to the front of the list of merge
strategies.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-05 15:35:02 -07:00
6a5fb96672 Change default merge backend from recursive to ort
There are a few reasons to switch the default:
  * Correctness
  * Extensibility
  * Performance

I'll provide some summaries about each.

=== Correctness ===

The original impetus for a new merge backend was to fix issues that were
difficult to fix within recursive's design.  The success with this goal
is perhaps most easily demonstrated by running the following:

  $ git grep -2 KNOWN_FAILURE t/ | grep -A 4 GIT_TEST_MERGE_ALGORITHM
  $ git grep test_expect_merge_algorithm.failure.success t/
  $ git grep test_expect_merge_algorithm.success.failure t/

In order, these greps show:

  * Seven sets of submodule tests (10 total tests) that fail with
    recursive but succeed with ort
  * 22 other tests that fail with recursive, but succeed with ort
  * 0 tests that pass with recursive, but fail with ort

=== Extensibility ===

Being able to perform merges without touching the working tree or index
makes it possible to create new features that were difficult with the
old backend:

  * Merging, cherry-picking, rebasing, reverting in bare repositories...
    or just on branches that aren't checked out.

  * `git diff AUTO_MERGE` -- ability to see what changes the user has
    made to resolve conflicts so far (see commit 5291828df8 ("merge-ort:
    write $GIT_DIR/AUTO_MERGE whenever we hit a conflict", 2021-03-20)

  * A --remerge-diff option for log/show, used to show diffs for merges
    that display the difference between what an automatic merge would
    have created and what was recorded in the merge.  (This option will
    often result in an empty diff because many merges are clean, but for
    the non-clean ones it will show how conflicts were fixed including
    the removal of conflict markers, and also show additional changes
    made outside of conflict regions to e.g. fix semantic conflicts.)

  * A --remerge-diff-only option for log/show, similar to --remerge-diff
    but also showing how cherry-picks or reverts differed from what an
    automatic cherry-pick or revert would provide.

The last three have been implemented already (though only one has been
submitted upstream so far; the others were waiting for performance work
to complete), and I still plan to implement the first one.

=== Performance ===

I'll quote from the summary of my final optimization for merge-ort
(while fixing the testcase name from 'no-renames' to 'few-renames'):

                               Timings

                                          Infinite
                 merge-       merge-     Parallelism
                recursive    recursive    of rename    merge-ort
                 v2.30.0      current     detection     current
                ----------   ---------   -----------   ---------
few-renames:      18.912 s    18.030 s     11.699 s     198.3 ms
mega-renames:   5964.031 s   361.281 s    203.886 s     661.8 ms
just-one-mega:   149.583 s    11.009 s      7.553 s     264.6 ms

                           Speedup factors

                                          Infinite
                 merge-       merge-     Parallelism
                recursive    recursive    of rename
                 v2.30.0      current     detection    merge-ort
                ----------   ---------   -----------   ---------
few-renames:        1           1.05         1.6           95
mega-renames:       1          16.5         29           9012
just-one-mega:      1          13.6         20            565

And, for partial clone users:

             Factor reduction in number of objects needed

                                          Infinite
                 merge-       merge-     Parallelism
                recursive    recursive    of rename
                 v2.30.0      current     detection    merge-ort
                ----------   ---------   -----------   ---------
mega-renames:       1            1            1          181.3

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-05 15:35:02 -07:00
29ef1f27fe revision: separate walk and unsorted flags
The `--no-walk` flag supports two modes: either it sorts the revisions
given as input input or it doesn't. This is reflected in a single
`no_walk` flag, which reflects one of the three states "walk", "don't
walk but without sorting" and "don't walk but with sorting".

Split up the flag into two separate bits, one indicating whether we
should walk or not and one indicating whether the input should be sorted
or not. This will allow us to more easily introduce a new flag
`--unsorted-input`, which only impacts the sorting bit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-05 09:37:28 -07:00
899062438f Makefile: normalize clobbering & xargs for tags targets
Since the "tags", "TAGS" and "cscope.out" targets rely on piping into
xargs with an "echo <list> | xargs" pattern, we need to make sure
we're in an append mode.

Unlike my recent change to make use of ".DELETE_ON_ERROR" in
7b76d6bf22 (Makefile: add and use the ".DELETE_ON_ERROR" flag,
2021-06-29), we really do need the "rm $@+" at the beginning (note,
not "rm $@").

This is because the xargs command may decide to invoke the program
multiple times. We need to make sure we've got a union of its results
at the end.

For "ctags" and "etags" we used the "-a" flag for this, for cscope
that behavior is the default. Its "-u" flag disables its equivalent of
an implicit "-a" flag.

Let's also consistently use the $@ and $@+ names instead of needlessly
hardcoding or referring to more verbose names in the "tags" and "TAGS"
rules.

These targets could perhaps be improved in the future by factoring
this "echo <list> | xargs" pattern so that we make intermediate tags
files for each source file, and then assemble them into one "tags"
file at the end.

The etags manual page suggests that doing that (or perhaps just
--update) might be counter-productive, in any case, the tag building
is fast enough for me, so I'm leaving that for now.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-05 09:31:15 -07:00
530a446d4a Makefile: remove "cscope.out", not "cscope*" in cscope.out target
Before we generate a "cscope.out" file, remove that file explicitly,
and not everything matching "cscope*". This doesn't change any
behavior of the Makefile in practice, but makes this rule less
confusing, and consistent with other similar rules.

The cscope target was added in a2a9150bf0 (makefile: Add a cscope
target, 2007-10-06). It has always referred to cscope* instead of to
cscope.out in .gitignore and the "clean" target, even though we only
ever generated a cscope.out file.

This was seemingly done to aid use-cases where someone invoked cscope
with the "-q" flag, which would make it create a "cscope.in.out" and
"cscope.po.out" files in addition to "cscope.out".

But us removing those files we never generated is confusing, so let's
only remove the file we need to, furthermore let's use the "-f" flag
to explicitly name the cscope.out file, even though it's the default
if not "-f" argument is supplied.

It is somewhat inconsistent to change from the glob here but not in
the "clean" rule and .gitignore, an earlier version of this change
updated those as well, but see [1][2] for why they were kept.

1. https://lore.kernel.org/git/87k0lit57x.fsf@evledraar.gmail.com/
2. https://lore.kernel.org/git/87im0kn983.fsf@evledraar.gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-05 09:31:15 -07:00
98e2d9d6f7 upload-pack: document and rename --advertise-refs
The --advertise-refs documentation in git-upload-pack added in
9812f2136b (upload-pack.c: use parse-options API, 2016-05-31) hasn't
been entirely true ever since v2 support was implemented in
e52449b672 (connect: request remote refs using v2, 2018-03-15). Under
v2 we don't advertise the refs at all, but rather dump the
capabilities header.

This option has always been an obscure internal implementation detail,
it wasn't even documented for git-receive-pack. Since it has exactly
one user let's rename it to --http-backend-info-refs, which is more
accurate and points the reader in the right direction. Let's also
cross-link this from the protocol v1 and v2 documentation.

I'm retaining a hidden --advertise-refs alias in case there's any
external users of this, and making both options hidden to the bash
completion (as with most other internal-only options).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-05 08:59:37 -07:00
f234da8019 serve.[ch]: remove "serve_options", split up --advertise-refs code
The "advertise capabilities" mode of serve.c added in
ed10cb952d (serve: introduce git-serve, 2018-03-15) is only used by
the http-backend.c to call {upload,receive}-pack with the
--advertise-refs parameter. See 42526b478e (Add stateless RPC options
to upload-pack, receive-pack, 2009-10-30).

Let's just make cmd_upload_pack() take the two (v2) or three (v2)
parameters the the v2/v1 servicing functions need directly, and pass
those in via the function signature. The logic of whether daemon mode
is implied by the timeout belongs in the v1 function (only used
there).

Once we split up the "advertise v2 refs" from "serve v2 request" it
becomes clear that v2 never cared about those in combination. The only
time it mattered was for v1 to emit its ref advertisement, in that
case we wanted to emit the smart-http-only "no-done" capability.

Since we only do that in the --advertise-refs codepath let's just have
it set "do_done" itself in v1's upload_pack() just before send_ref(),
at that point --advertise-refs and --stateless-rpc in combination are
redundant (the only user is get_info_refs() in http-backend.c), so we
can just pass in --advertise-refs only.

Since we need to touch all the serve() and advertise_capabilities()
codepaths let's rename them to less clever and obvious names, it's
been suggested numerous times, the latest of which is [1]'s suggestion
for protocol_v2_serve_loop(). Let's go with that.

1. https://lore.kernel.org/git/CAFQ2z_NyGb8rju5CKzmo6KhZXD0Dp21u-BbyCb2aNxLEoSPRJw@mail.gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-05 08:59:37 -07:00
760486a1f7 {upload,receive}-pack tests: add --advertise-refs tests
The --advertise-refs option had no explicit tests of its own, only
other http tests that would fail at a distance if it it was
broken. Let's test its behavior explicitly.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-05 08:59:37 -07:00
5befe8a1f1 serve.c: move version line to advertise_capabilities()
The advertise_capabilities() is only called from serve() and we always
emit this version line before it. In a subsequent commit I'll make
builtin/upload-pack.c sometimes call advertise_capabilities()
directly, so it'll make sense to have this line emitted by
advertise_capabilities(), not serve() itself.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-05 08:59:37 -07:00
eea7f7a977 serve: move transfer.advertiseSID check into session_id_advertise()
In 6b5b6e422e (serve: advertise session ID in v2 capabilities,
2020-11-11) the check for transfer.advertiseSID was added to the
beginning of the main serve() loop. Thus on startup of the server we'd
populate it.

Let's instead use an explicit lazy initialization pattern in
session_id_advertise() itself, we'll still look the config up only
once per-process, but by moving it out of serve() itself the further
changing of that routine becomes easier.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-05 08:59:37 -07:00
28a592e4f4 serve.[ch]: don't pass "struct strvec *keys" to commands
The serve.c API added in ed10cb952d (serve: introduce git-serve,
2018-03-15) was passing in the raw capabilities "keys", but nothing
downstream of it ever used them.

Let's remove that code because it's not needed. If we do end up
needing to pass information about the advertisement in the future
it'll make more sense to have serve.c parse the capabilities keys and
pass the result of its parsing, rather than expecting expecting its
API users to parse the same keys again.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-05 08:59:37 -07:00
85baaed475 serve: use designated initializers
Change the declaration of the protocol_capability struct to use
designated initializers, this makes this more verbose now, but a
follow-up commit will add a new field. At that point these lines would
be too dense to be on one line comfortably.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-05 08:59:37 -07:00
1fd88224a3 transport: use designated initializers
Change the assignments to the various transport_vtables to use
designated initializers, this makes the code easier to read and
maintain.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-05 08:59:37 -07:00
9b1cdd334d transport: rename "fetch" in transport_vtable to "fetch_refs"
Rename the "fetch" member of the transport_vtable to "fetch_refs" for
consistency with the existing "push_refs". Neither of them just push
"refs" but refs and objects, but having the two match makes the code
more readable than having it be inconsistent, especially since
"fetch_refs" is a lot easier to grep for than "fetch".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-05 08:59:36 -07:00
eacf36a4d1 serve: mark has_capability() as static
The has_capability() function introduced in ed10cb952d (serve:
introduce git-serve, 2018-03-15) has never been used anywhere except
serve.c, so let's mark it as static.

It was later changed from "extern" in 554544276a (*.[ch]: remove
extern from function declarations using spatch, 2019-04-29), but we
could have simply marked it as "static" instead.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-05 08:59:36 -07:00
81483fe613 Update error message and code comment
There were two locations in the code that referred to 'merge-recursive'
but which were also applicable to 'merge-ort'.  Update them to more
general wording.

Acked-by: Derrick Stolee <dstolee@microsoft.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-05 08:57:40 -07:00
67feccd3ba merge-strategies.txt: add coverage of the ort merge strategy
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-05 08:57:40 -07:00
6320813bc0 git-rebase.txt: correct out-of-date and misleading text about renames
Commit 58634dbff8 ("rebase: Allow merge strategies to be used when
rebasing", 2006-06-21) added the --merge option to git-rebase so that
renames could be detected (at least when using the `recursive` merge
backend).  However, git-am -3 gained that same ability in commit
579c9bb198 ("Use merge-recursive in git-am -3.", 2006-12-28).  As such,
the comment about being able to detect renames is not particularly
noteworthy.  Remove it.

Acked-by: Derrick Stolee <dstolee@microsoft.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-05 08:57:40 -07:00
b36ade216c merge-strategies.txt: fix simple capitalization error
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-05 08:57:40 -07:00
4d15c85556 merge-strategies.txt: avoid giving special preference to patience algorithm
We already have diff-algorithm that explains why there are special diff
algorithms, so we do not need to re-explain patience.  patience exists
as its own toplevel option for historical reasons, but there's no reason
to give it special preference or document it again and suggest it's more
important than other diff algorithms, so just refer to it as a
deprecated shorthand for `diff-algorithm=patience`.

Acked-by: Derrick Stolee <dstolee@microsoft.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-05 08:57:40 -07:00
002a6dfc7c merge-strategies.txt: do not imply using copy detection is desired
Stating that the recursive strategy "currently cannot make use of
detected copies" implies that this is a technical shortcoming of the
current algorithm.  I disagree with that.  I don't see how copies could
possibly be used in a sane fashion in a merge algorithm -- would we
propagate changes in one file on one side of history to each copy of
that file when merging?  That makes no sense to me.  I cannot think of
anything else that would make sense either.  Change the wording to
simply state that we ignore any copies.

Acked-by: Derrick Stolee <dstolee@microsoft.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-05 08:57:40 -07:00
510415ecc9 merge-strategies.txt: update wording for the resolve strategy
It is probably helpful to cover the default merge strategy first, so
move the text for the resolve strategy to later in the document.

Further, the wording for "resolve" claimed that it was "considered
generally safe and fast", which might imply in some readers minds that
the same is not true of other strategies.  Rather than adding this text
to all the strategies, just remove it from this one.

Acked-by: Derrick Stolee <dstolee@microsoft.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-05 08:57:39 -07:00
e80178eac6 Documentation: edit awkward references to git merge-recursive
A few places in the documentation referred to the "`recursive` strategy"
using the phrase "`git merge-recursive`", suggesting that it was forking
subprocesses to call a toplevel builtin.  Perhaps that was relevant to
when rebase was a shell script, but it seems like a rather indirect way
to refer to the `recursive` strategy.  Simplify the references.

Acked-by: Derrick Stolee <dstolee@microsoft.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-05 08:57:39 -07:00
b378df72ed directory-rename-detection.txt: small updates due to merge-ort optimizations
In commit 0c4fd732f0 ("Move computation of dir_rename_count from
merge-ort to diffcore-rename", 2021-02-27), much of the logic for
computing directory renames moved into diffcore-rename.
directory-rename-detection.txt had claims that all of that logic was
found in merge-recursive.  Update the documentation.

Acked-by: Derrick Stolee <dstolee@microsoft.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-05 08:57:39 -07:00
e037c2e418 git-rebase.txt: correct antiquated claims about --rebase-merges
When --rebase-merges was first introduced, it only worked with the
`recursive` strategy.  Some time later, it gained support for merges
using the `octopus` strategy.  The limitation of only supporting these
two strategies was documented in 25cff9f109 ("rebase -i --rebase-merges:
add a section to the man page", 2018-04-25) and lifted in e145d99347
("rebase -r: support merge strategies other than `recursive`",
2019-07-31).  However, when the limitation was lifted, the documentation
was not updated.  Update it now.

Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-05 08:57:39 -07:00
01758866ba l10n: pt_PT: update git-po-helper
Signed-off-by: Daniel Santos <hello@brighterdan.com>
2021-08-05 14:05:11 +01:00
cece123d71 l10n: pt_PT: remove trailing comments
* removed all unecessary trailing file comments

Signed-off-by: Daniel Santos <hello@brighterdan.com>
2021-08-05 14:05:11 +01:00
2a68d66127 l10n: pt_PT: translation tables
* filled translation table
 * add other translation table helper

Signed-off-by: Daniel Santos <hello@brighterdan.com>
2021-08-05 14:05:04 +01:00
310dc40996 l10n: id: po-id for 2.33.0 (round 1)
Translate following new components:
  * builtin/show-branch.c
  * builtin/show-index.c
  * builtin/show-ref.c
  * builtin/shortlog.c
  * builtin/describe.c
  * bisect.c
  * builtin/bisect--helper.c
  * blame.c
  * builtin/blame.c
  * grep.c
  * builtin/grep.c
  * builtin/diff-tree.c
  * builtin/diff.c
  * help.c

Update translation for following components:
  * diff.c
  * builtin/clone.c
  * builtin/fetch.c

Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
2021-08-05 18:08:57 +07:00
fac4caefa7 Merge branch 'tr-loc-v2.33' of github.com:bitigchi/git-po
* 'tr-loc-v2.33' of github.com:bitigchi/git-po:
  l10n: tr: v2.33.0 round 1
2021-08-05 12:56:29 +08:00
87c67efc0c l10n: tr: v2.33.0 round 1
Signed-off-by: Emir Sarı <bitigchi@me.com>
2021-08-05 07:45:52 +03:00
518bb518d1 Merge branch 'fr_v2.33_rnd1' of github.com:jnavila/git
* 'fr_v2.33_rnd1' of github.com:jnavila/git:
  l10n: fr.po v2.33 rnd 1
  l10n: fr: fix typo
2021-08-05 08:36:31 +08:00
e5a14ddd2d The eighth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-04 13:28:56 -07:00
099a64aa39 Merge branch 'tb/mingw-rmdir-symlink-to-directory'
Windows rmdir() equivalent behaves differently from POSIX ones in
that when used on a symbolic link that points at a directory, the
target directory gets removed, which has been corrected.

* tb/mingw-rmdir-symlink-to-directory:
  mingw: align symlinks-related rmdir() behavior with Linux
2021-08-04 13:28:56 -07:00
dfbbe8bd49 Merge branch 'ar/doc-markup-fix'
Doc mark-up fix.

* ar/doc-markup-fix:
  Documentation: render special characters correctly
2021-08-04 13:28:55 -07:00
fea3738ac5 Merge branch 'ab/getcwd-test'
Portability test update.

* ab/getcwd-test:
  t0001: fix broken not-quite getcwd(3) test in bed67874e2
2021-08-04 13:28:55 -07:00
4dc964691f Merge branch 'rs/use-fspathhash'
Code simplification.

* rs/use-fspathhash:
  use fspathhash() everywhere
2021-08-04 13:28:54 -07:00
5fef3b15db Merge branch 'pb/merge-autostash-more'
The local changes stashed by "git merge --autostash" were lost when
the merge failed in certain ways, which has been corrected.

* pb/merge-autostash-more:
  merge: apply autostash if merge strategy fails
  merge: apply autostash if fast-forward fails
  Documentation: define 'MERGE_AUTOSTASH'
  merge: add missing word "strategy" to a message
2021-08-04 13:28:54 -07:00
1a6fb019d6 Merge branch 'en/ort-perf-batch-14'
Further optimization on "merge -sort" backend.

* en/ort-perf-batch-14:
  merge-ort: restart merge with cached renames to reduce process entry cost
  merge-ort: avoid recursing into directories when we don't need to
  merge-ort: defer recursing into directories when merge base is matched
  merge-ort: add a handle_deferred_entries() helper function
  merge-ort: add data structures for allowable trivial directory resolves
  merge-ort: add some more explanations in collect_merge_info_callback()
  merge-ort: resolve paths early when we have sufficient information
2021-08-04 13:28:54 -07:00
506d2a354a Merge branch 'ds/commit-and-checkout-with-sparse-index'
"git checkout" and "git commit" learn to work without unnecessarily
expanding sparse indexes.

* ds/commit-and-checkout-with-sparse-index:
  unpack-trees: resolve sparse-directory/file conflicts
  t1092: document bad 'git checkout' behavior
  checkout: stop expanding sparse indexes
  sparse-index: recompute cache-tree
  commit: integrate with sparse-index
  p2000: compress repo names
  p2000: add 'git checkout -' test and decrease depth
2021-08-04 13:28:53 -07:00
58705b4903 Merge branch 'ab/update-submitting-patches'
Reorganize and update the SubmitingPatches document.

* ab/update-submitting-patches:
  SubmittingPatches: replace discussion of Travis with GitHub Actions
  SubmittingPatches: move discussion of Signed-off-by above "send"
2021-08-04 13:28:53 -07:00
31f9acf9ce Merge branch 'ah/plugleaks'
Leak plugging.

* ah/plugleaks:
  reset: clear_unpack_trees_porcelain to plug leak
  builtin/rebase: fix options.strategy memory lifecycle
  builtin/merge: free found_ref when done
  builtin/mv: free or UNLEAK multiple pointers at end of cmd_mv
  convert: release strbuf to avoid leak
  read-cache: call diff_setup_done to avoid leak
  ref-filter: also free head for ATOM_HEAD to avoid leak
  diffcore-rename: move old_dir/new_dir definition to plug leak
  builtin/for-each-repo: remove unnecessary argv copy to plug leak
  builtin/submodule--helper: release unused strbuf to avoid leak
  environment: move strbuf into block to plug leak
  fmt-merge-msg: free newly allocated temporary strings when done
2021-08-04 13:28:52 -07:00
10f57e0eb9 Merge branch 'ar/submodule-add'
Rewrite of "git submodule" in C continues.

* ar/submodule-add:
  submodule: drop unused sm_name parameter from show_fetch_remotes()
  submodule--helper: introduce add-clone subcommand
  submodule--helper: refactor module_clone()
  submodule: prefix die messages with 'fatal'
  t7400: test failure to add submodule in tracked path
2021-08-04 13:28:52 -07:00
3e5e6c6e94 fetch-pack: speed up loading of refs via commit graph
When doing reference negotiation, git-fetch-pack(1) is loading all refs
from disk in order to determine which commits it has in common with the
remote repository. This can be quite expensive in repositories with many
references though: in a real-world repository with around 2.2 million
refs, fetching a single commit by its ID takes around 44 seconds.

Dominating the loading time is decompression and parsing of the objects
which are referenced by commits. Given the fact that we only care about
commits (or tags which can be peeled to one) in this context, there is
thus an easy performance win by switching the parsing logic to make use
of the commit graph in case we have one available. Like this, we avoid
hitting the object database to parse these commits but instead only load
them from the commit-graph. This results in a significant performance
boost when executing git-fetch in said repository with 2.2 million refs:

    Benchmark #1: HEAD~: git fetch $remote $commit
      Time (mean ± σ):     44.168 s ±  0.341 s    [User: 42.985 s, System: 1.106 s]
      Range (min … max):   43.565 s … 44.577 s    10 runs

    Benchmark #2: HEAD: git fetch $remote $commit
      Time (mean ± σ):     19.498 s ±  0.724 s    [User: 18.751 s, System: 0.690 s]
      Range (min … max):   18.629 s … 20.454 s    10 runs

    Summary
      'HEAD: git fetch $remote $commit' ran
        2.27 ± 0.09 times faster than 'HEAD~: git fetch $remote $commit'

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-04 10:45:32 -07:00
11c649b891 diff: --pickaxe-all typofix
When I was fixing fuzzies as I updating po/id.po for 2.33.0 l10n round,
I noticed a triple-dash typo (--pickaxe-all) at diff.c, which according
to git-diff(1) manpage, the correct option name should be --pickaxe-all.

Fix the typo.

Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Acked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-04 10:34:47 -07:00
62a15162fe merge-ort: remove compile-time ability to turn off usage of memory pools
Simplify code maintenance by removing the ability to toggle between
usage of memory pools and direct allocations.  This allows us to also
remove paths_to_free since it was solely about bookkeeping to make sure
we freed the necessary paths, and allows us to remove some auxiliary
functions.

Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-03 15:13:38 -07:00
c131aab043 l10n: fr.po v2.33 rnd 1
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
2021-08-03 22:13:07 +02:00
98bdfc0c1b l10n: fr: fix typo
Reported-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
2021-08-03 21:06:47 +02:00
8d2cca9754 l10n: pt_PT: add Portuguese translations part 5
* git-po-helper update

Signed-off-by: Daniel Santos <hello@brighterdan.com>
2021-08-03 13:20:18 +01:00
ea00f105a7 l10n: pt_PT: add Portuguese translations part 4
* Fixed some typos
* Transformed 'não' (no) into affirmative
* Substituted 'excerto' to 'pedaço'

Signed-off-by: Daniel Santos <hello@brighterdan.com>
2021-08-03 13:20:15 +01:00
4ef987fab4 l10n: Update Catalan translation
Signed-off-by: Jordi Mas <jmas@softcatala.org>
2021-08-03 12:50:24 +02:00
d4df71b2b0 l10n: git.pot: v2.33.0 round 1 (38 new, 15 removed)
Generate po/git.pot from v2.33.0-rc0 for git v2.33.0 l10n round 1.

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
2021-08-03 17:06:56 +08:00
972c9cf6ae Merge branch 'master' of github.com:git/git
* 'master' of github.com:git/git: (397 commits)
  Git 2.33-rc0
  The seventh batch
  ci/install-dependencies: handle "sparse" job package installs
  ci: run "apt-get update" before "apt-get install"
  cache-tree: prefetch in partial clone read-tree
  unpack-trees: refactor prefetching code
  pack-bitmap: check pack validity when opening bitmap
  bundle tests: use test_cmp instead of grep
  bundle tests: use ">file" not ": >file"
  The sixth batch
  doc: pull: fix rebase=false documentation
  pack-bitmap: clarify comment in filter_bitmap_exclude_type()
  doc: clarify description of 'submodule.recurse'
  doc/git-config: simplify "override" advice for FILES section
  doc/git-config: clarify GIT_CONFIG environment variable
  doc/git-config: explain --file instead of referring to GIT_CONFIG
  t0000: fix test if run with TEST_OUTPUT_DIRECTORY
  multi-pack-index: fix potential segfault without sub-command
  refs/debug: quote prefix
  t0000: clear GIT_SKIP_TESTS before running sub-tests
  ...
2021-08-03 17:03:35 +08:00
3e7d4888e5 mingw: align symlinks-related rmdir() behavior with Linux
When performing a rebase, rmdir() is called on the folder .git/logs. On
Unix rmdir() exits without deleting anything in case .git/logs is a
symbolic link but the equivalent functions on Windows (_rmdir, _wrmdir
and RemoveDirectoryW) do not behave the same and remove the folder if it
is symlinked even if it is not empty.

This creates issues when folders in .git/ are symlinks which is
especially the case when git-repo[1] is used: It replaces `.git/logs/`
with a symlink.

One such issue is that the _target_ of that symlink is removed e.g.
during a `git rebase`, where `delete_reflog("REBASE_HEAD")` will not
only try to remove `.git/logs/REBASE_HEAD` but then recursively try to
remove the parent directories until an error occurs, a technique that
obviously relies on `rmdir()` refusing to remove a symlink.

This was reported in https://github.com/git-for-windows/git/issues/2967.

This commit updates mingw_rmdir() so that its behavior is the same as
Linux rmdir() in case of symbolic links.

To verify that Git does not regress on the reported issue, this patch
adds a regression test for the `git rebase` symptom, even if the same
`rmdir()` behavior is quite likely to cause potential problems in other
Git commands as well.

[1]: git-repo is a python tool built on top of Git which helps manage
many Git repositories. It stores all the .git/ folders in a central
place by taking advantage of symbolic links.
More information: https://gerrit.googlesource.com/git-repo/

Signed-off-by: Thomas Bétous <tomspycell@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-02 15:10:58 -07:00
4da8b2fcd4 t7508: avoid non POSIX BRE
24c30e0b6 (wt-status: tolerate dangling marks, 2020-09-01) adds a test
that uses a BRE which breaks at least with OpenBSD's grep.

switch to an ERE as it is done for similar checks and while at it, remove
the now obsolete test_i18ngrep call.

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-02 15:05:23 -07:00
1d9c8daef8 bundle doc: replace "basis" with "prerequsite(s)"
In the preceding commits we introduced new documentation that talks
about "[commit|object] prerequsite(s)", but also faithfully moved
around existing documentation that talks about the "basis".

Let's change both that moved-around documentation and other existing
documentation in the file to consistently use "[commit|object]"
prerequisite(s)" instead of talking about "basis". The mention of
"basis" isn't wrong, but readers will be helped by us using only one
term throughout the document for this concept.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-02 14:46:22 -07:00
5c8273d57c bundle doc: rewrite the "DESCRIPTION" section
Rewrite the "DESCRIPTION" section for "git bundle" to start by talking
about what bundles are in general terms, rather than diving directly
into one example of what they might be used for.

This changes documentation that's been substantially the same ever
since the command was added in 2e0afafebd (Add git-bundle: move
objects and references by archive, 2007-02-22).

I've split up the DESCRIPTION into that section and a "BUNDLE FORMAT"
section, it briefly discusses the format, but then links to the
technical/bundle-format.txt documentation.

The "the user must specify a basis" part of this is discussed below in
"SPECIFYING REFERENCES", and will be further elaborated on in a
subsequent commit. So I'm removing that part and letting the mention
of "revision exclusions" suffice.

There was a discussion about whether to say anything at all about
"thin packs" here[1]. I think it's good to mention it for the curious
reader willing to read the technical docs, but let's explicitly say
that there's no "thick pack", and that the difference shouldn't
matter.

1. http://lore.kernel.org/git/xmqqk0mbt5rj.fsf@gitster.g

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-02 14:46:21 -07:00
0bb92f3a3a bundle doc: elaborate on rev<->ref restriction
Elaborate on the restriction that you cannot provide a revision that
doesn't resolve to a reference in the "SPECIFYING REFERENCES" section
with examples.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-02 14:46:21 -07:00
9ab80dd6ae bundle doc: elaborate on object prerequisites
Split out the discussion bout "object prerequisites" into its own
section, and add some more examples of the common cases.

See 2e0afafebd (Add git-bundle: move objects and references by
archive, 2007-02-22) for the introduction of the documentation being
changed here.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-02 14:46:21 -07:00
66262451ec Git 2.33-rc0
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-02 14:06:43 -07:00
9bcdaab13e Merge branch 'jk/check-pack-valid-before-opening-bitmap'
A race between repacking and using pack bitmaps has been corrected.

* jk/check-pack-valid-before-opening-bitmap:
  pack-bitmap: check pack validity when opening bitmap
2021-08-02 14:06:43 -07:00
8230107f33 Merge branch 'jt/bulk-prefetch'
"git read-tree" had a codepath where blobs are fetched one-by-one
from the promisor remote, which has been corrected to fetch in bulk.

* jt/bulk-prefetch:
  cache-tree: prefetch in partial clone read-tree
  unpack-trees: refactor prefetching code
2021-08-02 14:06:42 -07:00
e9fe413fc2 Merge branch 'fc/pull-no-rebase-merges-theirs-into-ours'
Documentation fix for "git pull --rebase=no".

* fc/pull-no-rebase-merges-theirs-into-ours:
  doc: pull: fix rebase=false documentation
2021-08-02 14:06:42 -07:00
107687b5af Merge branch 'ab/bundle-tests'
"git bundle" gained more test coverage.

* ab/bundle-tests:
  bundle tests: use test_cmp instead of grep
  bundle tests: use ">file" not ": >file"
2021-08-02 14:06:41 -07:00
e163f73b7b Merge branch 'ps/perf-with-separate-output-directory'
Test update.

* ps/perf-with-separate-output-directory:
  perf: fix when running with TEST_OUTPUT_DIRECTORY
2021-08-02 14:06:41 -07:00
8a49dfacd6 Merge branch 'js/ci-check-whitespace-updates'
CI update.

* js/ci-check-whitespace-updates:
  ci(check-whitespace): restrict to the intended commits
  ci(check-whitespace): stop requiring a read/write token
2021-08-02 14:06:40 -07:00
5a9b455146 Merge branch 'jk/config-env-doc'
Documentation around GIT_CONFIG has been updated.

* jk/config-env-doc:
  doc/git-config: simplify "override" advice for FILES section
  doc/git-config: clarify GIT_CONFIG environment variable
  doc/git-config: explain --file instead of referring to GIT_CONFIG
2021-08-02 14:06:40 -07:00
c01881845c Merge branch 'pb/submodule-recurse-doc'
Doc update.

* pb/submodule-recurse-doc:
  doc: clarify description of 'submodule.recurse'
2021-08-02 14:06:39 -07:00
9556aadd63 Merge branch 'tb/bitmap-type-filter-comment-fix'
In-code comment update.

* tb/bitmap-type-filter-comment-fix:
  pack-bitmap: clarify comment in filter_bitmap_exclude_type()
2021-08-02 14:06:38 -07:00
977f8acefd t6001: avoid direct file system access
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-02 13:17:21 -07:00
f95661b740 t6500: use "ls -1" to snapshot ref database state
By doing ls -1 .git/{reftable,refs/heads}, we can capture changes to both
reftable and packed/loose ref storage.

This relies on the fact that git-pack-refs (which we're looking for here)
changes the number (loose/packed storage) and/or names (reftable) files used for
ref storage.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-02 13:17:21 -07:00
2cf9f0fca1 t7064: use update-ref -d to remove upstream branch
The previous code tested this by writing $ZERO_OID explicitly in the packed-refs
file. This is a type of corruption that doesn't reflect realistic use-cases. In
addition, even the ref-store test-tool refuses to write invalid OIDs.
(update-ref interprets $ZERO_OID is deleting the ref).

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-02 13:17:21 -07:00
fe14431526 t1410: mark test as REFFILES
This test takes a lock on the target of a symref, and then verifies that it is
possible to expire the symref's reflog. In reftable, one can only take a global
lock (which would prevent the symref reflog from being expired altogether.)

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-02 13:17:21 -07:00
a50234b3be t1405: mark test for 'git pack-refs' as REFFILES
The tests verifies that "pack-refs" causes loose refs to be packed. As both
loose and packed refs are concepts specific to the files backend, mark the test
as REFFILES.

Check the outcome of the pack-refs operation. This was apparently forgotten in
the commit introducing this test: 16feb99d (Mar 26 2017, "t1405: some basic
tests on main ref store").

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-02 13:17:21 -07:00
ace40eab9e t1405: use 'git reflog exists' to check reflog existence
This fixes a test failure for reftable.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-02 13:17:21 -07:00
100ac47bf3 t2402: use ref-store test helper to create broken symlink
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-02 13:17:20 -07:00
2f566d665a t3320: use git-symbolic-ref rather than filesystem access
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-02 13:17:20 -07:00
e46775cf9e t6120: use git-update-ref rather than filesystem access
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-02 13:17:20 -07:00
5e93b90dea t1503: mark symlink test as REFFILES
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-02 13:17:20 -07:00
e6b0a8fab8 t6050: use git-update-ref rather than filesystem access
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-02 13:17:20 -07:00
8dda4cbdf2 http: rename CURLOPT_FILE to CURLOPT_WRITEDATA
The CURLOPT_FILE name is an alias for CURLOPT_WRITEDATA, the
CURLOPT_WRITEDATA name has been preferred since curl 7.9.7, released
in May 2002[1].

1. https://curl.se/libcurl/c/CURLOPT_WRITEDATA.html

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-30 16:01:54 -07:00
5db9d38359 http: drop support for curl < 7.19.3 and < 7.17.0 (again)
Remove the conditional use of CURLAUTH_DIGEST_IE and
CURLOPT_USE_SSL. These two have been split from earlier simpler checks
against LIBCURL_VERSION_NUM for ease of review.

According to

  https://github.com/curl/curl/blob/master/docs/libcurl/symbols-in-versions

the CURLAUTH_DIGEST_IE flag became available in 7.19.3, and
CURLOPT_USE_SSL in 7.17.0.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-30 16:00:10 -07:00
7431842325 use fspathhash() everywhere
cf2dc1c238 (speed up alt_odb_usable() with many alternates, 2021-07-07)
introduced the function fspathhash() for calculating path hashes while
respecting the configuration option core.ignorecase.  Call it instead of
open-coding it; the resulting code is shorter and less repetitive.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-30 12:14:27 -07:00
644de29e22 http: drop support for curl < 7.19.4
In the last commit we dropped support for curl < 7.16.0, let's
continue that and drop support for versions older than 7.19.3. This
allows us to simplify the code by getting rid of some "#ifdef"'s.

Git was broken with vanilla curl < 7.19.4 from v2.12.0 until
v2.15.0. Compiling with it was broken by using CURLPROTO_* outside any
"#ifdef" in aeae4db174 (http: create function to get curl allowed
protocols, 2016-12-14), and fixed in v2.15.0 in f18777ba6e (http: fix
handling of missing CURLPROTO_*, 2017-08-11).

It's unclear how much anyone was impacted by that in practice, since
as noted in [1] RHEL versions using curl older than that still
compiled, because RedHat backported some features. Perhaps other
vendors did the same.

Still, it's one datapoint indicating that it wasn't in active use at
the time. That (the v2.12.0 release) was in Feb 24, 2017, with v2.15.0
on Oct 30, 2017, it's now mid-2021.

1. http://lore.kernel.org/git/c8a2716d-76ac-735c-57f9-175ca3acbcb0@jupiterrise.com;
   followed-up by f18777ba6e (http: fix handling of missing CURLPROTO_*,
   2017-08-11)

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-30 12:04:41 -07:00
482e1488a9 t0001: fix broken not-quite getcwd(3) test in bed67874e2
With a54e938e5b (strbuf: support long paths w/o read rights in
strbuf_getcwd() on FreeBSD, 2017-03-26) we had t0001 break on systems
like OpenBSD and AIX whose getcwd(3) has standard (but not like glibc
et al) behavior.

This was partially fixed in bed67874e2 (t0001: skip test with
restrictive permissions if getpwd(3) respects them, 2017-08-07).

The problem with that fix is that while its analysis of the problem is
correct, it doesn't actually call getcwd(3), instead it invokes "pwd
-P". There is no guarantee that "pwd -P" is going to call getcwd(3),
as opposed to e.g. being a shell built-in.

On AIX under both bash and ksh this test breaks because "pwd -P" will
happily display the current working directory, but getcwd(3) called by
the "git init" we're testing here will fail to get it.

I checked whether clobbering the $PWD environment variable would
affect it, and it didn't. Presumably these shells keep track of their
working directory internally.

There's possible follow-up work here in teaching strbuf_getcwd() to
get the working directory with whatever method "pwd" uses on these
platforms. See [1] for a discussion of that, but let's take the easy
way out here and just skip these tests by fixing the
GETCWD_IGNORES_PERMS prerequisite to match the limitations of
strbuf_getcwd().

1. https://lore.kernel.org/git/b650bef5-d739-d98d-e9f1-fa292b6ce982@web.de/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-30 10:18:27 -07:00
013c7e2b07 http: drop support for curl < 7.16.0
In the last commit we dropped support for curl < 7.11.1, let's
continue that and drop support for versions older than 7.16.0. This
allows us to get rid of some now-obsolete #ifdefs.

Choosing 7.16.0 is a somewhat arbitrary cutoff:

  1. It came out in October of 2006, almost 15 years ago.
     Besides being a nice round number, around 10 years is
     a common end-of-life support period, even for conservative
     distributions.

  2. That version introduced the curl_multi interface, which
     gives us a lot of bang for the buck in removing #ifdefs

RHEL 5 came with curl 7.15.5[1] (released in August 2006). RHEL 5's
extended life cycle program ended on 2020-11-30[1]. RHEL 6 comes with
curl 7.19.7 (released in November 2009), and RHEL 7 comes with
7.29.0 (released in February 2013).

1. http://lore.kernel.org/git/873e1f31-2a96-5b72-2f20-a5816cad1b51@jupiterrise.com

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-30 09:11:15 -07:00
1119a15b5c http: drop support for curl < 7.11.1
Drop support for this ancient version of curl and simplify the code by
allowing us get rid of some "#ifdef"'s.

Git will not build with vanilla curl older than 7.11.1 due our use of
CURLOPT_POSTFIELDSIZE in 37ee680d9b
(http.postbuffer: allow full range of ssize_t values,
2017-04-11). This field was introduced in curl 7.11.1.

We could solve these compilation problems with more #ifdefs,
but it's not worth the trouble. Version 7.11.1 came out in
March of 2004, over 17 years ago. Let's declare that too old
and drop any existing ifdefs that go further back. One
obvious benefit is that we'll have fewer conditional bits
cluttering the code.

This patch drops all #ifdefs that reference older versions
(note that curl's preprocessor macros are in hex, so we're
looking for 070b01, not 071101).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-30 09:11:15 -07:00
f0b922473e Documentation: render special characters correctly
Three hyphens are rendered verbatim, so "--" has to be used to produce a
dash.  There is no double arrow ("<->" is rendered as "<→"), so a left
and right arrow "<-->" have to be combined for that.

So fix asciidoc output for special characters.  This is similar to fixes
in commit de82095a95 (doc hash-function-transition: fix asciidoc output,
2021-02-05).

Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-30 09:08:12 -07:00
092e5115d1 merge-ort: reuse path strings in pool_alloc_filespec
pool_alloc_filespec() was written so that the code when pool != NULL
mimicked the code from alloc_filespec(), which including allocating
enough extra space for the path and then copying it.  However, the path
passed to pool_alloc_filespec() is always going to already be in the
same memory pool, so we may as well reuse it instead of copying it.

For the testcases mentioned in commit 557ac0350d ("merge-ort: begin
performance work; instrument with trace2_region_* calls", 2020-10-28),
this change improves the performance as follows:

                            Before                  After
    no-renames:       198.5 ms ±  3.4 ms     198.3 ms ±  2.9 ms
    mega-renames:     679.1 ms ±  5.6 ms     661.8 ms ±  5.9 ms
    just-one-mega:    271.9 ms ±  2.8 ms     264.6 ms ±  2.5 ms

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-30 09:01:19 -07:00
f239fff4c1 merge-ort: store filepairs and filespecs in our mem_pool
For the testcases mentioned in commit 557ac0350d ("merge-ort: begin
performance work; instrument with trace2_region_* calls", 2020-10-28),
this change improves the performance as follows:

                            Before                  After
    no-renames:       198.1 ms ±  2.6 ms     198.5 ms ±  3.4 ms
    mega-renames:     715.8 ms ±  4.0 ms     679.1 ms ±  5.6 ms
    just-one-mega:    276.8 ms ±  4.2 ms     271.9 ms ±  2.8 ms

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-30 09:01:19 -07:00
a8791ef649 diffcore-rename, merge-ort: add wrapper functions for filepair alloc/dealloc
We want to be able to allocate filespecs and filepairs using a mem_pool.
However, filespec data will still remain outside the pool (perhaps in
the future we could plumb the pool through the various diff APIs to
allocate the filespec data too, but for now we are limiting the scope).
Add some extra functions to allocate these appropriately based on the
non-NULL-ness of opt->priv->pool, as well as some extra functions to
handle correctly deallocating the relevant parts of them.  A future
commit will make use of these new functions.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-30 09:01:19 -07:00
6697ee01b5 merge-ort: switch our strmaps over to using memory pools
For all the strmaps (including strintmaps and strsets) whose memory is
unconditionally freed as part of clear_or_reinit_internal_opts(), switch
them over to using our new memory pool.

For the testcases mentioned in commit 557ac0350d ("merge-ort: begin
performance work; instrument with trace2_region_* calls", 2020-10-28),
this change improves the performance as follows:

                            Before                  After
    no-renames:      202.5  ms ±  3.2  ms    198.1 ms ±  2.6 ms
    mega-renames:      1.072 s ±  0.012 s    715.8 ms ±  4.0 ms
    just-one-mega:   357.3  ms ±  3.9  ms    276.8 ms ±  4.2 ms

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-30 09:01:19 -07:00
4137c54b90 merge-ort: set up a memory pool
merge-ort has a lot of data structures, and they all tend to be freed
together in clear_or_reinit_internal_opts().  Set up a memory pool to
allow us to make these allocations and deallocations faster.  Future
commits will adjust various callers to make use of this memory pool.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-30 09:01:18 -07:00
cdf2241c71 merge-ort: add pool_alloc, pool_calloc, and pool_strndup wrappers
Make the code more flexible so that it can handle both being run with or
without a memory pool by adding utility functions which will either call
    xmalloc, xcalloc, xstrndup
or
    mem_pool_alloc, mem_pool_calloc, mem_pool_strndup
depending on whether we have a non-NULL memory pool.  A subsequent
commit will make use of these.

(We will actually be dropping these functions soon and just assuming we
always have a memory pool, but the flexibility was very useful during
development of merge-ort so I want to be able to restore it if needed.)

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-30 09:01:18 -07:00
fa0e936fbb diffcore-rename: use a mem_pool for exact rename detection's hashmap
Exact rename detection, via insert_file_table(), uses a hashmap to store
files by oid.  Use a mem_pool for the hashmap entries so these can all be
allocated and deallocated together.

For the testcases mentioned in commit 557ac0350d ("merge-ort: begin
performance work; instrument with trace2_region_* calls", 2020-10-28),
this change improves the performance as follows:

                            Before                  After
    no-renames:      204.2  ms ±  3.0  ms   202.5  ms ±  3.2  ms
    mega-renames:      1.076 s ±  0.015 s     1.072 s ±  0.012 s
    just-one-mega:   364.1  ms ±  7.0  ms   357.3  ms ±  3.9  ms

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-30 09:01:18 -07:00
7afc0b03a2 merge-ort: rename str{map,intmap,set}_func()
In order to make it clearer that these three variables holding a
function refer to functions that will clear the strmap/strintmap/strset,
rename them to str{map,intmap,set}_clear_func().

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-30 09:01:18 -07:00
42f8ed6ca2 add: remove ensure_full_index() with --renormalize
The --renormalize option updates the EOL conversions for the tracked
files. However, the loop already ignores files marked with the
SKIP_WORKTREE bit, so it will continue to do so with a sparse index
because the sparse directory entries also have this bit set.

Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-29 12:36:34 -07:00
939fa07582 add: ignore outside the sparse-checkout in refresh()
Since b243012 (refresh_index(): add flag to ignore SKIP_WORKTREE
entries, 2021-04-08), 'git add --refresh <path>' will output a warning
message when the path is outside the sparse-checkout definition. The
implementation of this warning happened in parallel with the
sparse-index work to add ensure_full_index() calls throughout the
codebase.

Update this loop to have the proper logic that checks to see if the
pathspec is outside the sparse-checkout definition. This avoids the need
to expand the sparse directory entry and determine if the path is
tracked, untracked, or ignored. We simply avoid updating the stat()
information because there isn't even an entry that matches the path!

Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-29 12:36:34 -07:00
4eaffd81a5 pathspec: stop calling ensure_full_index
The add_pathspec_matches_against_index() focuses on matching a pathspec
to file entries in the index. This already works correctly for its only
use: checking if untracked files exist in the index.

The compatibility checks in t1092 already test that 'git add <dir>'
works for a directory outside of the sparse cone. That provides coverage
for removing this guard.

This finalizes our ability to run 'git add .' without expanding a sparse
index to a full one. This is evidenced by an update to t1092 and by
these performance numbers for p2000-sparse-operations.sh:

Test                                    HEAD~1            HEAD
--------------------------------------------------------------------------------
2000.10: git add . (full-index-v3)      0.37(0.28+0.07)   0.36(0.27+0.06) -2.7%
2000.11: git add . (full-index-v4)      0.33(0.26+0.06)   0.32(0.28+0.05) -3.0%
2000.12: git add . (sparse-index-v3)    0.57(0.53+0.07)   0.06(0.06+0.07) -89.5%
2000.13: git add . (sparse-index-v4)    0.57(0.53+0.07)   0.05(0.03+0.09) -91.2%

While the ~90% improvement is shown by the test results, it is worth
noting that expanding the sparse index was adding overhead in previous
commits. Comparing to the full index case, we see the performance go
from 0.33s to 0.05s, an 85% improvement.

Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-29 12:36:34 -07:00
5e7cbab196 add: allow operating on a sparse-only index
Disable command_requires_full_index for 'git add'. This does not require
any additional removals of ensure_full_index(). The main reason is that
'git add' discovers changes based on the pathspec and the worktree
itself. These are then inserted into the index directly, and calls to
index_name_pos() or index_file_exists() already call expand_to_path() at
the appropriate time to support a sparse-index.

Add a test to check that 'git add -A' and 'git add <file>' does not
expand the index at all, as long as <file> is not within a sparse
directory. This does not help the global 'git add .' case.

We can measure the improvement using p2000-sparse-operations.sh with
these results:

Test                                  HEAD~1           HEAD
------------------------------------------------------------------------------
2000.6: git add -A (full-index-v3)    0.35(0.30+0.05)  0.37(0.29+0.06) +5.7%
2000.7: git add -A (full-index-v4)    0.31(0.26+0.06)  0.33(0.27+0.06) +6.5%
2000.8: git add -A (sparse-index-v3)  0.57(0.53+0.07)  0.05(0.04+0.08) -91.2%
2000.9: git add -A (sparse-index-v4)  0.58(0.55+0.06)  0.05(0.05+0.06) -91.4%

While the 91% improvement seems impressive, it's important to recognize
that previously we had significant overhead for expanding the
sparse-index. Comparing to the full index case, 'git add -A' goes from
0.37s to 0.05s, which is "only" an 86% improvement.

This modification to 'git add' creates some behavior change depending on
the use of a sparse index. We modify a test in t1092 to demonstrate
these changes which will be remedied in future changes.

Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-29 12:36:34 -07:00
83ad8ca596 t1092: test merge conflicts outside cone
Conflicts can occur outside of the sparse-checkout definition, and in
that case users might try to resolve the conflicts in several ways.
Document a few of these ways in a test. Make it clear that this behavior
is not necessarily the optimal flow, since users can become confused
when Git deletes these files from the worktree in later commands.

Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-29 12:36:34 -07:00
940fe202ad The seventh batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-28 13:18:05 -07:00
1d07640b65 Merge branch 'ps/t0000-output-directory-fix'
"TEST_OUTPUT_DIRECTORY=there make test" failed to work, which has
been corrected.

* ps/t0000-output-directory-fix:
  t0000: fix test if run with TEST_OUTPUT_DIRECTORY
2021-07-28 13:18:05 -07:00
7f554a4f69 Merge branch 'tb/reverse-midx'
The code that gives an error message in "git multi-pack-index" when
no subcommand is given tried to print a NULL pointer as a strong,
which has been corrected.

* tb/reverse-midx:
  multi-pack-index: fix potential segfault without sub-command
2021-07-28 13:18:04 -07:00
aaf113ed95 Merge branch 'hn/refs-debug-empty-prefix'
Debugging aid.

* hn/refs-debug-empty-prefix:
  refs/debug: quote prefix
2021-07-28 13:18:04 -07:00
fa8b225d86 Merge branch 'pb/dont-complete-aliased-options'
The completion support used to offer alternate spelling of options
that exist only for compatibility, which has been corrected.

* pb/dont-complete-aliased-options:
  parse-options: don't complete option aliases by default
2021-07-28 13:18:03 -07:00
268055bfde Merge branch 'en/rename-limits-doc'
Documentation on "git diff -l<n>" and diff.renameLimit have been
updated, and the defaults for these limits have been raised.

* en/rename-limits-doc:
  rename: bump limit defaults yet again
  diffcore-rename: treat a rename_limit of 0 as unlimited
  doc: clarify documentation for rename/copy limits
  diff: correct warning message when renameLimit exceeded
2021-07-28 13:18:03 -07:00
546adc4950 Merge branch 'ds/gender-neutral-doc-guidelines'
A guideline for gender neutral documentation has been added.

* ds/gender-neutral-doc-guidelines:
  CodingGuidelines: recommend gender-neutral description
2021-07-28 13:18:02 -07:00
b271a3034f Merge branch 'ds/status-with-sparse-index'
"git status" codepath learned to work with sparsely populated index
without hydrating it fully.

* ds/status-with-sparse-index:
  t1092: document bad sparse-checkout behavior
  fsmonitor: integrate with sparse index
  wt-status: expand added sparse directory entries
  status: use sparse-index throughout
  status: skip sparse-checkout percentage with sparse-index
  diff-lib: handle index diffs with sparse dirs
  dir.c: accept a directory as part of cone-mode patterns
  unpack-trees: unpack sparse directory entries
  unpack-trees: rename unpack_nondirectories()
  unpack-trees: compare sparse directories correctly
  unpack-trees: preserve cache_bottom
  t1092: add tests for status/add and sparse files
  t1092: expand repository data shape
  t1092: replace incorrect 'echo' with 'cat'
  sparse-index: include EXTENDED flag when expanding
  sparse-index: skip indexes with unmerged entries
2021-07-28 13:18:02 -07:00
6d56fb28fb Merge branch 'js/ci-make-sparse'
The CI gained a new job to run "make sparse" check.

* js/ci-make-sparse:
  ci/install-dependencies: handle "sparse" job package installs
  ci: run "apt-get update" before "apt-get install"
  ci: run `make sparse` as part of the GitHub workflow
2021-07-28 13:18:01 -07:00
5bae927222 Merge branch 'ab/pkt-line-tests'
Tests that cover protocol bits have been updated and helpers
used there have been consolidated.

* ab/pkt-line-tests:
  test-lib-functions: use test-tool for [de]packetize()
2021-07-28 13:18:00 -07:00
4d4c8ddd37 Merge branch 'jk/t0000-subtests-fix'
Test fix.

* jk/t0000-subtests-fix:
  t0000: clear GIT_SKIP_TESTS before running sub-tests
2021-07-28 13:18:00 -07:00
6ca224f156 Merge branch 'dl/diff-merge-base'
"git diff --merge-base" documentation has been updated.

* dl/diff-merge-base:
  git-diff: fix missing --merge-base docs
2021-07-28 13:17:59 -07:00
dd6d3c90ee Merge branch 'ab/attribute-format'
Many "printf"-like helper functions we have have been annotated
with __attribute__() to catch placeholder/parameter mismatches.

* ab/attribute-format:
  advice.h: add missing __attribute__((format)) & fix usage
  *.h: add a few missing __attribute__((format))
  *.c static functions: add missing __attribute__((format))
  sequencer.c: move static function to avoid forward decl
  *.c static functions: don't forward-declare __attribute__
2021-07-28 13:17:59 -07:00
c9d6d8a193 Merge branch 'jk/log-decorate-optim'
Optimize "git log" for cases where we wasted cycles to load ref
decoration data that may not be needed.

* jk/log-decorate-optim:
  load_ref_decorations(): fix decoration with tags
  add_ref_decoration(): rename s/type/deco_type/
  load_ref_decorations(): avoid parsing non-tag objects
  object.h: add lookup_object_by_type() function
  object.h: expand docstring for lookup_unknown_object()
  log: avoid loading decorations for userformats that don't need it
  pretty.h: update and expand docstring for userformat_find_requirements()
2021-07-28 13:17:58 -07:00
01369fdfd3 Merge branch 'sm/worktree-add-lock'
"git worktree add --lock" learned to record why the worktree is
locked with a custom message.

* sm/worktree-add-lock:
  worktree: teach `add` to accept --reason <string> with --lock
  worktree: mark lock strings with `_()` for translation
  t2400: clean up '"add" worktree with lock' test
2021-07-28 13:17:58 -07:00
e5cc59c77c Merge branch 'ew/many-alternate-optim'
Optimization for repositories with many alternate object store.

* ew/many-alternate-optim:
  oidtree: a crit-bit tree for odb_loose_cache
  oidcpy_with_padding: constify `src' arg
  make object_directory.loose_objects_subdir_seen a bitmap
  avoid strlen via strbuf_addstr in link_alt_odb_entry
  speed up alt_odb_usable() with many alternates
2021-07-28 13:17:57 -07:00
14793a4e37 Merge branch 'hj/commit-allow-empty-message'
"git commit --allow-empty-message" won't abort the operation upon
an empty message, but the hint shown in the editor said otherwise.

* hj/commit-allow-empty-message:
  commit: remove irrelavent prompt on `--allow-empty-message`
  commit: reorganise commit hint strings
2021-07-28 13:17:57 -07:00
1e893a1216 Merge branch 'dl/packet-read-response-end-fix'
Error message update.

* dl/packet-read-response-end-fix:
  pkt-line: replace "stateless separator" with "response end"
2021-07-28 13:17:56 -07:00
1fcc40cd1d bisect: simplify return code from bisect_checkout()
The function was designed to return only BISECT_OK (0) or
BISECT_FAILED (-1) and no other values, but there were two issues:

 - The comment misspelled BISECT_FAILED as BISECT_FAILURE, even
   though the logic it described (i.e. any non-zero return should be
   reported as a single BISECT_FAILED) was correct.

 - It took the return value from run_command_v_opt(), and assumed it
   was either -1 or 1 upon error, which is not the case; it can relay
   errors from wait_or_whine(), which can report exit status of the
   child process.

Translate any error return from run_command_v_opt() to BISECT_FAILED,
and simplify the resulting code by losing the 'res' variable that is
no longer needed.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-28 10:57:37 -07:00
ffcb4e94d3 bisect: do not run show-branch just to show the current commit
In scripted versions of "git bisect", we used "git show-branch" to
describe a single commit in the bisect log and also to the interactive
user after checking out the next version to be tested.

The former use of "git show-branch" was lost when the helper
function that wrote bisect log entries was rewritten at 0f30233a
(bisect--helper: `bisect_write` shell function in C, 2019-01-02) in
C

But we've kept the latter ever since 0871984d (bisect: make "git
bisect" use new "--next-all" bisect-helper function, 2009-05-09)
started using the faithful C-rewrite introduced at ef24c7ca
(bisect--helper: add "--next-exit" to output bisect results,
2009-04-19).

Showing "[<full hex>] <subject>" is simple enough with our helper
pretty.c::format_commit_message() and spawning show-branch is an
overkill.  Let's lose one external process.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-28 10:57:26 -07:00
4577d26dc0 t4060: remove unused variable
Signed-off-by: David Turner <dturner@twosigma.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-26 15:35:54 -07:00
27f45ccf33 ci/install-dependencies: handle "sparse" job package installs
This just matches the style/location of the package installation for
other jobs. There should be no functional change.

I did flip the order of the options and command-name ("-y update"
instead of "update -y") for consistency with other lines in the same
file.

Note also that we have to reorder the dependency install with the
"checkout" action, so that we actually have the "ci" scripts available.

Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-26 15:20:51 -07:00
8231c841ff ci: run "apt-get update" before "apt-get install"
The "sparse" workflow runs "apt-get install" to pick up a few necessary
packages. But it needs to run "apt-get update" first, or it risks trying
to download an old package version that no longer exists. And in fact
this happens now, with output like:

  2021-07-26T17:40:51.2551880Z E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/main/c/curl/libcurl4-openssl-dev_7.68.0-1ubuntu2.5_amd64.deb  404  Not Found [IP: 52.147.219.192 80]
  2021-07-26T17:40:51.2554304Z E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?

Our other ci jobs don't suffer from this; they rely on scripts in ci/,
and ci/install-dependencies does the appropriate "apt-get update".

Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-26 15:20:37 -07:00
7ed37eb8ae expand_user_path: allow in-flight topics to keep using the old name
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-26 14:55:05 -07:00
9a863b3358 reset: clear_unpack_trees_porcelain to plug leak
setup_unpack_trees_porcelain() populates various fields on
unpack_tree_opts, we need to call clear_unpack_trees_porcelain() to
avoid leaking them. Specifically, we used to leak
unpack_tree_opts.msgs_to_free.

We have to do this in leave_reset_head because there are multiple
scenarios where unpack_tree_opts has already been configured, followed
by a 'goto leave_reset_head'. But we can also 'goto leave_reset_head'
prior to having initialised unpack_tree_opts via memset(..., 0, ...).
Therefore we also move unpack_tree_opts initialisation to the start of
reset_head(), and convert it to use brace initialisation - which
guarantees that we can never clear an uninitialised unpack_tree_opts.
clear_unpack_tree_opts() is always safe to call as long as
unpack_tree_opts is at least zero-initialised, i.e. it does not depend
on a previous call to setup_unpack_trees_porcelain().

LSAN output from t0021:

Direct leak of 192 byte(s) in 1 object(s) allocated from:
    #0 0x49ab49 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
    #1 0xa721e5 in xrealloc wrapper.c:126:8
    #2 0x9f7861 in strvec_push_nodup strvec.c:19:2
    #3 0x9f7861 in strvec_pushf strvec.c:39:2
    #4 0xa43e14 in setup_unpack_trees_porcelain unpack-trees.c:129:3
    #5 0x97e011 in reset_head reset.c:53:2
    #6 0x61dfa5 in cmd_rebase builtin/rebase.c:1991:9
    #7 0x4ce83e in run_builtin git.c:475:11
    #8 0x4ccafe in handle_builtin git.c:729:3
    #9 0x4cb01c in run_argv git.c:818:4
    #10 0x4cb01c in cmd_main git.c:949:19
    #11 0x6b3f3d in main common-main.c:52:11
    #12 0x7fa8addf3349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Indirect leak of 147 byte(s) in 1 object(s) allocated from:
    #0 0x49ab49 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
    #1 0xa721e5 in xrealloc wrapper.c:126:8
    #2 0x9e8d54 in strbuf_grow strbuf.c:98:2
    #3 0x9e8d54 in strbuf_vaddf strbuf.c:401:3
    #4 0x9f7774 in strvec_pushf strvec.c:36:2
    #5 0xa43e14 in setup_unpack_trees_porcelain unpack-trees.c:129:3
    #6 0x97e011 in reset_head reset.c:53:2
    #7 0x61dfa5 in cmd_rebase builtin/rebase.c:1991:9
    #8 0x4ce83e in run_builtin git.c:475:11
    #9 0x4ccafe in handle_builtin git.c:729:3
    #10 0x4cb01c in run_argv git.c:818:4
    #11 0x4cb01c in cmd_main git.c:949:19
    #12 0x6b3f3d in main common-main.c:52:11
    #13 0x7fa8addf3349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Indirect leak of 134 byte(s) in 1 object(s) allocated from:
    #0 0x49ab49 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
    #1 0xa721e5 in xrealloc wrapper.c:126:8
    #2 0x9e8d54 in strbuf_grow strbuf.c:98:2
    #3 0x9e8d54 in strbuf_vaddf strbuf.c:401:3
    #4 0x9f7774 in strvec_pushf strvec.c:36:2
    #5 0xa43fe4 in setup_unpack_trees_porcelain unpack-trees.c:168:3
    #6 0x97e011 in reset_head reset.c:53:2
    #7 0x61dfa5 in cmd_rebase builtin/rebase.c:1991:9
    #8 0x4ce83e in run_builtin git.c:475:11
    #9 0x4ccafe in handle_builtin git.c:729:3
    #10 0x4cb01c in run_argv git.c:818:4
    #11 0x4cb01c in cmd_main git.c:949:19
    #12 0x6b3f3d in main common-main.c:52:11
    #13 0x7fa8addf3349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Indirect leak of 130 byte(s) in 1 object(s) allocated from:
    #0 0x49ab49 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
    #1 0xa721e5 in xrealloc wrapper.c:126:8
    #2 0x9e8d54 in strbuf_grow strbuf.c:98:2
    #3 0x9e8d54 in strbuf_vaddf strbuf.c:401:3
    #4 0x9f7774 in strvec_pushf strvec.c:36:2
    #5 0xa43f20 in setup_unpack_trees_porcelain unpack-trees.c:150:3
    #6 0x97e011 in reset_head reset.c:53:2
    #7 0x61dfa5 in cmd_rebase builtin/rebase.c:1991:9
    #8 0x4ce83e in run_builtin git.c:475:11
    #9 0x4ccafe in handle_builtin git.c:729:3
    #10 0x4cb01c in run_argv git.c:818:4
    #11 0x4cb01c in cmd_main git.c:949:19
    #12 0x6b3f3d in main common-main.c:52:11
    #13 0x7fa8addf3349 in __libc_start_main (/lib64/libc.so.6+0x24349)

SUMMARY: AddressSanitizer: 603 byte(s) leaked in 4 allocation(s).

Signed-off-by: Andrzej Hunt <andrzej@ahunt.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-26 12:19:21 -07:00
b54cf3a766 builtin/rebase: fix options.strategy memory lifecycle
- cmd_rebase populates rebase_options.strategy with newly allocated
  strings, hence we need to free those strings at the end of cmd_rebase
  to avoid a leak.
- In some cases: get_replay_opts() is called, which prepares replay_opts
  using data from rebase_options. We used to simply copy the pointer
  from rebase_options.strategy,  however that would now result in a
  double-free because sequencer_remove_state() is eventually used to
  free replay_opts.strategy. To avoid this we xstrdup() strategy when
  adding it to replay_opts.

The original leak happens because we always populate
rebase_options.strategy, but we don't always enter the path that calls
get_replay_opts() and later sequencer_remove_state() - in  other words
we'd always allocate a new string into rebase_options.strategy but
only sometimes did we free it. We now make sure that rebase_options
and replay_opts both own their own copies of strategy, and each copy
is free'd independently.

This was first seen when running t0021 with LSAN, but t2012 helped catch
the fact that we can't just free(options.strategy) at the end of
cmd_rebase (as that can cause a double-free). LSAN output from t0021:

LSAN output from t0021:

Direct leak of 4 byte(s) in 1 object(s) allocated from:
    #0 0x486804 in strdup ../projects/compiler-rt/lib/asan/asan_interceptors.cpp:452:3
    #1 0xa71eb8 in xstrdup wrapper.c:29:14
    #2 0x61b1cc in cmd_rebase builtin/rebase.c:1779:22
    #3 0x4ce83e in run_builtin git.c:475:11
    #4 0x4ccafe in handle_builtin git.c:729:3
    #5 0x4cb01c in run_argv git.c:818:4
    #6 0x4cb01c in cmd_main git.c:949:19
    #7 0x6b3fad in main common-main.c:52:11
    #8 0x7f267b512349 in __libc_start_main (/lib64/libc.so.6+0x24349)

SUMMARY: AddressSanitizer: 4 byte(s) leaked in 1 allocation(s).

Signed-off-by: Andrzej Hunt <andrzej@ahunt.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-26 12:19:21 -07:00
8c05e42c7a builtin/merge: free found_ref when done
merge_name() calls dwim_ref(), which allocates a new string into
found_ref. Therefore add a free() to avoid leaking found_ref.

LSAN output from t0021:

Direct leak of 16 byte(s) in 1 object(s) allocated from:
    #0 0x486804 in strdup ../projects/compiler-rt/lib/asan/asan_interceptors.cpp:452:3
    #1 0xa8beb8 in xstrdup wrapper.c:29:14
    #2 0x954054 in expand_ref refs.c:671:12
    #3 0x953cb6 in repo_dwim_ref refs.c:644:22
    #4 0x5d3759 in dwim_ref refs.h:162:9
    #5 0x5d3759 in merge_name builtin/merge.c:517:6
    #6 0x5d3759 in collect_parents builtin/merge.c:1214:5
    #7 0x5cf60d in cmd_merge builtin/merge.c:1458:16
    #8 0x4ce83e in run_builtin git.c:475:11
    #9 0x4ccafe in handle_builtin git.c:729:3
    #10 0x4cb01c in run_argv git.c:818:4
    #11 0x4cb01c in cmd_main git.c:949:19
    #12 0x6bdbfd in main common-main.c:52:11
    #13 0x7f0430502349 in __libc_start_main (/lib64/libc.so.6+0x24349)

SUMMARY: AddressSanitizer: 16 byte(s) leaked in 1 allocation(s).

Signed-off-by: Andrzej Hunt <andrzej@ahunt.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-26 12:19:20 -07:00
ed3c566d97 builtin/mv: free or UNLEAK multiple pointers at end of cmd_mv
These leaks all happen at the end of cmd_mv, hence don't matter in any
way. But we still fix the easy ones and squash the rest to get us closer
to being able to run tests without leaks.

LSAN output from t0050:

Direct leak of 384 byte(s) in 1 object(s) allocated from:
    #0 0x49ab49 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
    #1 0xa8c015 in xrealloc wrapper.c:126:8
    #2 0xa0a7e1 in add_entry string-list.c:44:2
    #3 0xa0a7e1 in string_list_insert string-list.c:58:14
    #4 0x5dac03 in cmd_mv builtin/mv.c:248:4
    #5 0x4ce83e in run_builtin git.c:475:11
    #6 0x4ccafe in handle_builtin git.c:729:3
    #7 0x4cb01c in run_argv git.c:818:4
    #8 0x4cb01c in cmd_main git.c:949:19
    #9 0x6bd9ad in main common-main.c:52:11
    #10 0x7fbfeffc4349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Direct leak of 16 byte(s) in 1 object(s) allocated from:
    #0 0x49a82d in malloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
    #1 0xa8bd09 in do_xmalloc wrapper.c:41:8
    #2 0x5dbc34 in internal_prefix_pathspec builtin/mv.c:32:2
    #3 0x5da575 in cmd_mv builtin/mv.c:158:14
    #4 0x4ce83e in run_builtin git.c:475:11
    #5 0x4ccafe in handle_builtin git.c:729:3
    #6 0x4cb01c in run_argv git.c:818:4
    #7 0x4cb01c in cmd_main git.c:949:19
    #8 0x6bd9ad in main common-main.c:52:11
    #9 0x7fbfeffc4349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Direct leak of 16 byte(s) in 1 object(s) allocated from:
    #0 0x49a82d in malloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
    #1 0xa8bd09 in do_xmalloc wrapper.c:41:8
    #2 0x5dbc34 in internal_prefix_pathspec builtin/mv.c:32:2
    #3 0x5da4e4 in cmd_mv builtin/mv.c:148:11
    #4 0x4ce83e in run_builtin git.c:475:11
    #5 0x4ccafe in handle_builtin git.c:729:3
    #6 0x4cb01c in run_argv git.c:818:4
    #7 0x4cb01c in cmd_main git.c:949:19
    #8 0x6bd9ad in main common-main.c:52:11
    #9 0x7fbfeffc4349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Direct leak of 8 byte(s) in 1 object(s) allocated from:
    #0 0x49a9a2 in calloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:154:3
    #1 0xa8c119 in xcalloc wrapper.c:140:8
    #2 0x5da585 in cmd_mv builtin/mv.c:159:22
    #3 0x4ce83e in run_builtin git.c:475:11
    #4 0x4ccafe in handle_builtin git.c:729:3
    #5 0x4cb01c in run_argv git.c:818:4
    #6 0x4cb01c in cmd_main git.c:949:19
    #7 0x6bd9ad in main common-main.c:52:11
    #8 0x7fbfeffc4349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Direct leak of 4 byte(s) in 1 object(s) allocated from:
    #0 0x49a9a2 in calloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:154:3
    #1 0xa8c119 in xcalloc wrapper.c:140:8
    #2 0x5da4f8 in cmd_mv builtin/mv.c:149:10
    #3 0x4ce83e in run_builtin git.c:475:11
    #4 0x4ccafe in handle_builtin git.c:729:3
    #5 0x4cb01c in run_argv git.c:818:4
    #6 0x4cb01c in cmd_main git.c:949:19
    #7 0x6bd9ad in main common-main.c:52:11
    #8 0x7fbfeffc4349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Indirect leak of 65 byte(s) in 1 object(s) allocated from:
    #0 0x49ab49 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
    #1 0xa8c015 in xrealloc wrapper.c:126:8
    #2 0xa00226 in strbuf_grow strbuf.c:98:2
    #3 0xa00226 in strbuf_vaddf strbuf.c:394:3
    #4 0xa065c7 in xstrvfmt strbuf.c:981:2
    #5 0xa065c7 in xstrfmt strbuf.c:991:8
    #6 0x9e7ce7 in prefix_path_gently setup.c:115:15
    #7 0x9e7fa6 in prefix_path setup.c:128:12
    #8 0x5dbdbf in internal_prefix_pathspec builtin/mv.c:55:23
    #9 0x5da575 in cmd_mv builtin/mv.c:158:14
    #10 0x4ce83e in run_builtin git.c:475:11
    #11 0x4ccafe in handle_builtin git.c:729:3
    #12 0x4cb01c in run_argv git.c:818:4
    #13 0x4cb01c in cmd_main git.c:949:19
    #14 0x6bd9ad in main common-main.c:52:11
    #15 0x7fbfeffc4349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Indirect leak of 65 byte(s) in 1 object(s) allocated from:
    #0 0x49ab49 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
    #1 0xa8c015 in xrealloc wrapper.c:126:8
    #2 0xa00226 in strbuf_grow strbuf.c:98:2
    #3 0xa00226 in strbuf_vaddf strbuf.c:394:3
    #4 0xa065c7 in xstrvfmt strbuf.c:981:2
    #5 0xa065c7 in xstrfmt strbuf.c:991:8
    #6 0x9e7ce7 in prefix_path_gently setup.c:115:15
    #7 0x9e7fa6 in prefix_path setup.c:128:12
    #8 0x5dbdbf in internal_prefix_pathspec builtin/mv.c:55:23
    #9 0x5da4e4 in cmd_mv builtin/mv.c:148:11
    #10 0x4ce83e in run_builtin git.c:475:11
    #11 0x4ccafe in handle_builtin git.c:729:3
    #12 0x4cb01c in run_argv git.c:818:4
    #13 0x4cb01c in cmd_main git.c:949:19
    #14 0x6bd9ad in main common-main.c:52:11
    #15 0x7fbfeffc4349 in __libc_start_main (/lib64/libc.so.6+0x24349)

SUMMARY: AddressSanitizer: 558 byte(s) leaked in 7 allocation(s).

Signed-off-by: Andrzej Hunt <andrzej@ahunt.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-26 12:19:20 -07:00
675ea4efdb convert: release strbuf to avoid leak
apply_multi_file_filter and async_query_available_blobs both query
subprocess output using subprocess_read_status, which writes data into
the identically named filter_status strbuf. We add a strbuf_release to
avoid leaking their contents.

Leak output seen when running t0021 with LSAN:

Direct leak of 24 byte(s) in 1 object(s) allocated from:
    #0 0x49ab49 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
    #1 0xa8c2b5 in xrealloc wrapper.c:126:8
    #2 0x9ff99d in strbuf_grow strbuf.c:98:2
    #3 0x9ff99d in strbuf_addbuf strbuf.c:304:2
    #4 0xa101d6 in subprocess_read_status sub-process.c:45:5
    #5 0x77793c in apply_multi_file_filter convert.c:886:8
    #6 0x77793c in apply_filter convert.c:1042:10
    #7 0x77a0b5 in convert_to_git_filter_fd convert.c:1492:7
    #8 0x8b48cd in index_stream_convert_blob object-file.c:2156:2
    #9 0x8b48cd in index_fd object-file.c:2248:9
    #10 0x597411 in hash_fd builtin/hash-object.c:43:9
    #11 0x596be1 in hash_object builtin/hash-object.c:59:2
    #12 0x596be1 in cmd_hash_object builtin/hash-object.c:153:3
    #13 0x4ce83e in run_builtin git.c:475:11
    #14 0x4ccafe in handle_builtin git.c:729:3
    #15 0x4cb01c in run_argv git.c:818:4
    #16 0x4cb01c in cmd_main git.c:949:19
    #17 0x6bdc2d in main common-main.c:52:11
    #18 0x7f42acf79349 in __libc_start_main (/lib64/libc.so.6+0x24349)

SUMMARY: AddressSanitizer: 24 byte(s) leaked in 1 allocation(s).

Direct leak of 120 byte(s) in 5 object(s) allocated from:
    #0 0x49ab49 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
    #1 0xa8c295 in xrealloc wrapper.c:126:8
    #2 0x9ff97d in strbuf_grow strbuf.c:98:2
    #3 0x9ff97d in strbuf_addbuf strbuf.c:304:2
    #4 0xa101b6 in subprocess_read_status sub-process.c:45:5
    #5 0x775c73 in async_query_available_blobs convert.c:960:8
    #6 0x80029d in finish_delayed_checkout entry.c:183:9
    #7 0xa65d1e in check_updates unpack-trees.c:493:10
    #8 0xa5f469 in unpack_trees unpack-trees.c:1747:8
    #9 0x525971 in checkout builtin/clone.c:815:6
    #10 0x525971 in cmd_clone builtin/clone.c:1409:8
    #11 0x4ce83e in run_builtin git.c:475:11
    #12 0x4ccafe in handle_builtin git.c:729:3
    #13 0x4cb01c in run_argv git.c:818:4
    #14 0x4cb01c in cmd_main git.c:949:19
    #15 0x6bdc2d in main common-main.c:52:11
    #16 0x7fa253fce349 in __libc_start_main (/lib64/libc.so.6+0x24349)

SUMMARY: AddressSanitizer: 120 byte(s) leaked in 5 allocation(s).

Signed-off-by: Andrzej Hunt <andrzej@ahunt.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-26 12:19:20 -07:00
8d833e9337 read-cache: call diff_setup_done to avoid leak
repo_diff_setup() calls through to diff.c's static prep_parse_options(),
which in  turn allocates a new array into diff_opts.parseopts.
diff_setup_done() is responsible for freeing that array, and has the
benefit of verifying diff_opts too - hence we add a call to
diff_setup_done() to avoid leaking parseopts.

Output from the leak as found while running t0090 with LSAN:

Direct leak of 7120 byte(s) in 1 object(s) allocated from:
    #0 0x49a82d in malloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
    #1 0xa8bf89 in do_xmalloc wrapper.c:41:8
    #2 0x7a7bae in prep_parse_options diff.c:5636:2
    #3 0x7a7bae in repo_diff_setup diff.c:4611:2
    #4 0x93716c in repo_index_has_changes read-cache.c:2518:3
    #5 0x872233 in unclean merge-ort-wrappers.c:12:14
    #6 0x872233 in merge_ort_recursive merge-ort-wrappers.c:53:6
    #7 0x5d5b11 in try_merge_strategy builtin/merge.c:752:12
    #8 0x5d0b6b in cmd_merge builtin/merge.c:1666:9
    #9 0x4ce83e in run_builtin git.c:475:11
    #10 0x4ccafe in handle_builtin git.c:729:3
    #11 0x4cb01c in run_argv git.c:818:4
    #12 0x4cb01c in cmd_main git.c:949:19
    #13 0x6bdc2d in main common-main.c:52:11
    #14 0x7f551eb51349 in __libc_start_main (/lib64/libc.so.6+0x24349)

SUMMARY: AddressSanitizer: 7120 byte(s) leaked in 1 allocation(s)

Signed-off-by: Andrzej Hunt <andrzej@ahunt.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-26 12:19:20 -07:00
d7cf4188e2 ref-filter: also free head for ATOM_HEAD to avoid leak
u.head is populated using resolve_refdup(), which returns a newly
allocated string - hence we also need to free() it.

Found while running t0041 with LSAN:

Direct leak of 16 byte(s) in 1 object(s) allocated from:
    #0 0x486804 in strdup ../projects/compiler-rt/lib/asan/asan_interceptors.cpp:452:3
    #1 0xa8be98 in xstrdup wrapper.c:29:14
    #2 0x9481db in head_atom_parser ref-filter.c:549:17
    #3 0x9408c7 in parse_ref_filter_atom ref-filter.c:703:30
    #4 0x9400e3 in verify_ref_format ref-filter.c:974:8
    #5 0x4f9e8b in print_ref_list builtin/branch.c:439:6
    #6 0x4f9e8b in cmd_branch builtin/branch.c:757:3
    #7 0x4ce83e in run_builtin git.c:475:11
    #8 0x4ccafe in handle_builtin git.c:729:3
    #9 0x4cb01c in run_argv git.c:818:4
    #10 0x4cb01c in cmd_main git.c:949:19
    #11 0x6bdc2d in main common-main.c:52:11
    #12 0x7f96edf86349 in __libc_start_main (/lib64/libc.so.6+0x24349)

SUMMARY: AddressSanitizer: 16 byte(s) leaked in 1 allocation(s).

Signed-off-by: Andrzej Hunt <andrzej@ahunt.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-26 12:19:20 -07:00
4e3250b7fb diffcore-rename: move old_dir/new_dir definition to plug leak
old_dir/new_dir are free()'d at the end of update_dir_rename_counts,
however if we return early we'll never free those strings. Therefore
we should move all new allocations after the possible early return,
avoiding a leak.

This seems like a fairly recent leak, that started happening since the
early-return was added in:
  1ad69eb0dc (diffcore-rename: compute dir_rename_counts in stages, 2021-02-27)

LSAN output from t0022:

Direct leak of 7 byte(s) in 1 object(s) allocated from:
    #0 0x486804 in strdup ../projects/compiler-rt/lib/asan/asan_interceptors.cpp:452:3
    #1 0xa71e48 in xstrdup wrapper.c:29:14
    #2 0x7db9c7 in update_dir_rename_counts diffcore-rename.c:464:12
    #3 0x7db6ae in find_renames diffcore-rename.c:1062:3
    #4 0x7d76c3 in diffcore_rename_extended diffcore-rename.c:1472:18
    #5 0x7b4cfc in diffcore_std diff.c:6705:4
    #6 0x855e46 in log_tree_diff_flush log-tree.c:846:2
    #7 0x856574 in log_tree_diff log-tree.c:955:3
    #8 0x856574 in log_tree_commit log-tree.c:986:10
    #9 0x9a9c67 in print_commit_summary sequencer.c:1329:7
    #10 0x52e623 in cmd_commit builtin/commit.c:1862:3
    #11 0x4ce83e in run_builtin git.c:475:11
    #12 0x4ccafe in handle_builtin git.c:729:3
    #13 0x4cb01c in run_argv git.c:818:4
    #14 0x4cb01c in cmd_main git.c:949:19
    #15 0x6b3f3d in main common-main.c:52:11
    #16 0x7fe397c7a349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Direct leak of 7 byte(s) in 1 object(s) allocated from:
    #0 0x486804 in strdup ../projects/compiler-rt/lib/asan/asan_interceptors.cpp:452:3
    #1 0xa71e48 in xstrdup wrapper.c:29:14
    #2 0x7db9bc in update_dir_rename_counts diffcore-rename.c:463:12
    #3 0x7db6ae in find_renames diffcore-rename.c:1062:3
    #4 0x7d76c3 in diffcore_rename_extended diffcore-rename.c:1472:18
    #5 0x7b4cfc in diffcore_std diff.c:6705:4
    #6 0x855e46 in log_tree_diff_flush log-tree.c:846:2
    #7 0x856574 in log_tree_diff log-tree.c:955:3
    #8 0x856574 in log_tree_commit log-tree.c:986:10
    #9 0x9a9c67 in print_commit_summary sequencer.c:1329:7
    #10 0x52e623 in cmd_commit builtin/commit.c:1862:3
    #11 0x4ce83e in run_builtin git.c:475:11
    #12 0x4ccafe in handle_builtin git.c:729:3
    #13 0x4cb01c in run_argv git.c:818:4
    #14 0x4cb01c in cmd_main git.c:949:19
    #15 0x6b3f3d in main common-main.c:52:11
    #16 0x7fe397c7a349 in __libc_start_main (/lib64/libc.so.6+0x24349)

SUMMARY: AddressSanitizer: 14 byte(s) leaked in 2 allocation(s).

Signed-off-by: Andrzej Hunt <andrzej@ahunt.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-26 12:19:20 -07:00
2b2999460c builtin/for-each-repo: remove unnecessary argv copy to plug leak
cmd_for_each_repo() copies argv into args (a strvec), which is later
passed into run_command_on_repo(), which in turn copies that strvec onto
the end of child.args. The initial copy is unnecessary (we never modify
args). We therefore choose to just pass argv directly into
run_command_on_repo(), which lets us avoid the copy and fixes the leak.

LSAN output from t0068:

Direct leak of 192 byte(s) in 1 object(s) allocated from:
    #0 0x7f63bd4ab8b0 in realloc (/usr/lib64/libasan.so.4+0xdc8b0)
    #1 0x98d7e6 in xrealloc wrapper.c:126
    #2 0x916914 in strvec_push_nodup strvec.c:19
    #3 0x916a6e in strvec_push strvec.c:26
    #4 0x4be4eb in cmd_for_each_repo builtin/for-each-repo.c:49
    #5 0x410dcd in run_builtin git.c:475
    #6 0x410dcd in handle_builtin git.c:729
    #7 0x414087 in run_argv git.c:818
    #8 0x414087 in cmd_main git.c:949
    #9 0x40e9ec in main common-main.c:52
    #10 0x7f63bc9fa349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Indirect leak of 22 byte(s) in 2 object(s) allocated from:
    #0 0x7f63bd445e30 in __interceptor_strdup (/usr/lib64/libasan.so.4+0x76e30)
    #1 0x98d698 in xstrdup wrapper.c:29
    #2 0x916a63 in strvec_push strvec.c:26
    #3 0x4be4eb in cmd_for_each_repo builtin/for-each-repo.c:49
    #4 0x410dcd in run_builtin git.c:475
    #5 0x410dcd in handle_builtin git.c:729
    #6 0x414087 in run_argv git.c:818
    #7 0x414087 in cmd_main git.c:949
    #8 0x40e9ec in main common-main.c:52
    #9 0x7f63bc9fa349 in __libc_start_main (/lib64/libc.so.6+0x24349)

See also discussion about the original implementation below - this code
appears to have evolved from a callback explaining the double-strvec-copy
pattern, but there's no strong reason to keep that now:
  https://lore.kernel.org/git/68bbeca5-314b-08ee-ef36-040e3f3814e9@gmail.com/

Signed-off-by: Andrzej Hunt <andrzej@ahunt.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-26 12:19:20 -07:00
edfc744918 builtin/submodule--helper: release unused strbuf to avoid leak
relative_url() populates sb. In the normal return path, its buffer is
detached using strbuf_detach(). However the early return path does
nothing with sb, which means that sb's memory is leaked - therefore
we add a release to avoid this leak.

The reset is also only necessary for the normal return path, hence we
move it down to after the early-return to avoid unnecessary work.

LSAN output from t0060:

Direct leak of 121 byte(s) in 1 object(s) allocated from:
    #0 0x7f31246f28b0 in realloc (/usr/lib64/libasan.so.4+0xdc8b0)
    #1 0x98d7d6 in xrealloc wrapper.c:126
    #2 0x909a60 in strbuf_grow strbuf.c:98
    #3 0x90bf00 in strbuf_vaddf strbuf.c:401
    #4 0x90c321 in strbuf_addf strbuf.c:335
    #5 0x5cb78d in relative_url builtin/submodule--helper.c:182
    #6 0x5cbe46 in resolve_relative_url_test builtin/submodule--helper.c:248
    #7 0x410dcd in run_builtin git.c:475
    #8 0x410dcd in handle_builtin git.c:729
    #9 0x414087 in run_argv git.c:818
    #10 0x414087 in cmd_main git.c:949
    #11 0x40e9ec in main common-main.c:52
    #12 0x7f3123c41349 in __libc_start_main (/lib64/libc.so.6+0x24349)

SUMMARY: AddressSanitizer: 121 byte(s) leaked in 1 allocation(s).

Signed-off-by: Andrzej Hunt <andrzej@ahunt.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-26 12:19:20 -07:00
14c3dd817d environment: move strbuf into block to plug leak
realpath is only populated if we execute the git_work_tree_initialized
block. However that block also causes us to return early, meaning we
never actually release the strbuf in the case where we populated it.
Therefore we move all strbuf related code into the block to guarantee
that we can't leak it.

LSAN output from t0095:

Direct leak of 129 byte(s) in 1 object(s) allocated from:
    #0 0x49a9b9 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
    #1 0x78f585 in xrealloc wrapper.c:126:8
    #2 0x713ff4 in strbuf_grow strbuf.c:98:2
    #3 0x713ff4 in strbuf_getcwd strbuf.c:597:3
    #4 0x4f0c18 in strbuf_realpath_1 abspath.c:99:7
    #5 0x5ae4a4 in set_git_work_tree environment.c:259:3
    #6 0x6fdd8a in setup_discovered_git_dir setup.c:931:2
    #7 0x6fdd8a in setup_git_directory_gently setup.c:1235:12
    #8 0x4cb50d in get_bloom_filter_for_commit t/helper/test-bloom.c:41:2
    #9 0x4cb50d in cmd__bloom t/helper/test-bloom.c:95:3
    #10 0x4caa1f in cmd_main t/helper/test-tool.c:124:11
    #11 0x4caded in main common-main.c:52:11
    #12 0x7f0869f02349 in __libc_start_main (/lib64/libc.so.6+0x24349)

SUMMARY: AddressSanitizer: 129 byte(s) leaked in 1 allocation(s).

It looks like this leak has existed since realpath was first added to
set_git_work_tree() in:
  3d7747e318 (real_path: remove unsafe API, 2020-03-10)

Signed-off-by: Andrzej Hunt <andrzej@ahunt.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-26 12:19:20 -07:00
9fa6213731 fmt-merge-msg: free newly allocated temporary strings when done
origin starts off pointing to somewhere within line, which is owned by
the caller. Later we might allocate a new string using xmemdupz() or
xstrfmt(). To avoid leaking these new strings, we introduce a to_free
pointer - which allows us to safely free the newly allocated string when
we're done (we cannot just free origin directly as it might still be
pointing to line).

LSAN output from t0090:

Direct leak of 8 byte(s) in 1 object(s) allocated from:
    #0 0x49a82d in malloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
    #1 0xa71f49 in do_xmalloc wrapper.c:41:8
    #2 0xa720b0 in do_xmallocz wrapper.c:75:8
    #3 0xa720b0 in xmallocz wrapper.c:83:9
    #4 0xa720b0 in xmemdupz wrapper.c:99:16
    #5 0x8092ba in handle_line fmt-merge-msg.c:187:23
    #6 0x8092ba in fmt_merge_msg fmt-merge-msg.c:666:7
    #7 0x5ce2e6 in prepare_merge_message builtin/merge.c:1119:2
    #8 0x5ce2e6 in collect_parents builtin/merge.c:1215:3
    #9 0x5c9c1e in cmd_merge builtin/merge.c:1454:16
    #10 0x4ce83e in run_builtin git.c:475:11
    #11 0x4ccafe in handle_builtin git.c:729:3
    #12 0x4cb01c in run_argv git.c:818:4
    #13 0x4cb01c in cmd_main git.c:949:19
    #14 0x6b3fad in main common-main.c:52:11
    #15 0x7fb929620349 in __libc_start_main (/lib64/libc.so.6+0x24349)

SUMMARY: AddressSanitizer: 8 byte(s) leaked in 1 allocation(s).

Signed-off-by: Andrzej Hunt <andrzej@ahunt.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-26 12:19:19 -07:00
e394a16023 interpolate_path(): allow specifying paths relative to the runtime prefix
Ever since Git learned to detect its install location at runtime, there
was the slightly awkward problem that it was impossible to specify paths
relative to said location.

For example, if a version of Git was shipped with custom SSL
certificates to use, there was no portable way to specify
`http.sslCAInfo`.

In Git for Windows, the problem was "solved" for years by interpreting
paths starting with a slash as relative to the runtime prefix.

However, this is not correct: such paths _are_ legal on Windows, and
they are interpreted as absolute paths in the same drive as the current
directory.

After a lengthy discussion, and an even lengthier time to mull over the
problem and its best solution, and then more discussions, we eventually
decided to introduce support for the magic sequence `%(prefix)/`. If a
path starts with this, the remainder is interpreted as relative to the
detected (runtime) prefix. If built without runtime prefix support, Git
will simply interpolate the compiled-in prefix.

If a user _wants_ to specify a path starting with the magic sequence,
they can prefix the magic sequence with `./` and voilà, the path won't
be expanded.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-26 12:17:16 -07:00
a03b097d63 Use a better name for the function interpolating paths
It is not immediately clear what `expand_user_path()` means, so let's
rename it to `interpolate_path()`. This also opens the path for
interpolating more than just a home directory.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-26 12:17:16 -07:00
644e6b2c0f expand_user_path(): clarify the role of the real_home parameter
The `real_home` parameter only has an effect when expanding paths
starting with `~/`, not when expanding paths starting with `~<user>/`.
Let's make that clear.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-26 12:17:16 -07:00
789f6f226b expand_user_path(): remove stale part of the comment
In 395de250d9 (Expand ~ and ~user in core.excludesfile,
commit.template, 2009-11-17), the `user_path()` function was refactored
into the `expand_user_path()`. During that refactoring, the `buf`
parameter was lost, but the code comment above said function still talks
about it. Let's remove that stale part of the comment.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-26 12:17:16 -07:00
b7d11a0f5d tests: exercise the RUNTIME_PREFIX feature
Originally, we refrained from adding a regression test in 7b6c6496374
(system_path(): Add prefix computation at runtime if RUNTIME_PREFIX set,
2008-08-10), and in 226c0ddd0d (exec_cmd: RUNTIME_PREFIX on some POSIX
systems, 2018-04-10).

The reason was that it was deemed too tricky to test.

Turns out that it is not tricky to test at all: we simply create a
pseudo-root, copy the `git` executable into the `git/` subdirectory of
that pseudo-root, then copy a script into the `libexec/git-core/`
directory and expect that to be picked up.

As long as the trash directory is in a location where binaries can be
executed, this works.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-26 12:17:16 -07:00
bbe3165f82 submodule: drop unused sm_name parameter from show_fetch_remotes()
This parameter has not been used since the function was introduced in
8c8195e9c3 (submodule--helper: introduce add-clone subcommand,
2021-07-10).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-26 12:03:44 -07:00
b9dee075eb ref-filter: add %(rest) atom
%(rest) is a atom used for cat-file batch mode, which can split
the input lines at the first whitespace boundary, all characters
before that whitespace are considered to be the object name;
characters after that first run of whitespace (i.e., the "rest"
of the line) are output in place of the %(rest) atom.

In order to let "cat-file --batch=%(rest)" use the ref-filter
interface, add %(rest) atom for ref-filter.

Introduce the reject_atom() to reject the atom %(rest) for
"git for-each-ref", "git branch", "git tag" and "git verify-tag".

Reviewed-by: Jacob Keller <jacob.keller@gmail.com>
Suggected-by: Jacob Keller <jacob.keller@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-26 12:01:26 -07:00
e85fcb355a ref-filter: use non-const ref_format in *_atom_parser()
Use non-const ref_format in *_atom_parser(), which can help us
modify the members of ref_format in *_atom_parser().

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-26 12:01:26 -07:00
7121c4d4e2 ref-filter: --format=%(raw) support --perl
Because the perl language can handle binary data correctly,
add the function perl_quote_buf_with_len(), which can specify
the length of the data and prevent the data from being truncated
at '\0' to help `--format="%(raw)"` support `--perl`.

Reviewed-by: Jacob Keller <jacob.keller@gmail.com>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-26 12:01:25 -07:00
bd0708c7eb ref-filter: add %(raw) atom
Add new formatting option `%(raw)`, which will print the raw
object data without any changes. It will help further to migrate
all cat-file formatting logic from cat-file to ref-filter.

The raw data of blob, tree objects may contain '\0', but most of
the logic in `ref-filter` depends on the output of the atom being
text (specifically, no embedded NULs in it).

E.g. `quote_formatting()` use `strbuf_addstr()` or `*._quote_buf()`
add the data to the buffer. The raw data of a tree object is
`100644 one\0...`, only the `100644 one` will be added to the buffer,
which is incorrect.

Therefore, we need to find a way to record the length of the
atom_value's member `s`. Although strbuf can already record the
string and its length, if we want to replace the type of atom_value's
member `s` with strbuf, many places in ref-filter that are filled
with dynamically allocated mermory in `v->s` are not easy to replace.
At the same time, we need to check if `v->s == NULL` in
populate_value(), and strbuf cannot easily distinguish NULL and empty
strings, but c-style "const char *" can do it. So add a new member in
`struct atom_value`: `s_size`, which can record raw object size, it
can help us add raw object data to the buffer or compare two buffers
which contain raw object data.

Note that `--format=%(raw)` cannot be used with `--python`, `--shell`,
`--tcl`, and `--perl` because if the binary raw data is passed to a
variable in such languages, these may not support arbitrary binary data
in their string variable type.

Reviewed-by: Jacob Keller <jacob.keller@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Helped-by: Bagas Sanjaya <bagasdotme@gmail.com>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Helped-by: Felipe Contreras <felipe.contreras@gmail.com>
Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Junio C Hamano <gitster@pobox.com>
Based-on-patch-by: Olga Telezhnaya <olyatelezhnaya@gmail.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-26 12:01:25 -07:00
311d0b8e8e ref-filter: add obj-type check in grab contents
Only tag and commit objects use `grab_sub_body_contents()` to grab
object contents in the current codebase.  We want to teach the
function to also handle blobs and trees to get their raw data,
without parsing a blob (whose contents looks like a commit or a tag)
incorrectly as a commit or a tag. So it's needed to pass a
`struct expand_data *data` instread of only `void *buf` to both
`grab_sub_body_contents()` and `grab_values()` to be able to check
the object type.

Skip the block of code that is specific to handling commits and tags
early when the given object is of a wrong type to help later
addition to handle other types of objects in this function.

Reviewed-by: Jacob Keller <jacob.keller@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-26 12:01:25 -07:00
e082631e51 merge: apply autostash if merge strategy fails
Since 'git merge' learned '--autostash' in a03b55530a (merge: teach
--autostash option, 2020-04-07), 'cmd_merge', once it is determined that
we have to create a merge commit, calls 'create_autostash' if
'--autostash' is given.

As explained in a03b55530a, and made more abvious by the tests added in
that commit, the autostash is then applied if the merge succeeds, either
directly or by committing (after conflict resolution or if '--no-commit'
was given), or if the merge is aborted with 'git merge --abort'. In some
other cases, like the user calling 'git reset --merge' or 'git merge
--quit', the autostash is not applied, but saved in the stash list.

However, there exists a scenario that creates an autostash but does not
apply nor save it to the stash list: if the chosen merge strategy
completely fails to handle the merge, i.e. 'try_merge_strategy' returns
2.

Apply the autostash in that case also. An easy way to test that is to
try to merge more than two commits but explicitely ask for the 'recursive'
merge strategy.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-23 15:45:40 -07:00
12510bd5da merge: apply autostash if fast-forward fails
Since 'git merge' learned '--autostash' in a03b55530a (merge: teach
--autostash option, 2020-04-07), 'cmd_merge', in the fast-forward case,
calls 'create_autostash' before calling 'checkout_fast_forward' if
'--autostash' is given.

However, if 'checkout_fast_forward' fails, the autostash is not applied
to the working tree, nor saved in the stash list, since the code simply
calls 'goto done'.

Be more helpful to the user by applying the autostash in that case.

An easy way to test a failing fast-forward is when we are merging a
branch that has a tracked file that conflicts with an untracked file in
the working tree.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-23 15:45:38 -07:00
fd441eb612 Documentation: define 'MERGE_AUTOSTASH'
The documentation for 'git merge --abort' and 'git merge --quit' both
mention the special ref 'MERGE_AUTOSTASH', but this ref is not formally
defined anywhere. Mention it in the description of the '--autostash'
option for 'git merge'.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-23 15:45:35 -07:00
9938f30d13 merge: add missing word "strategy" to a message
The variable 'best_strategy' holds the name of the merge strategy that
resulted in fewer conflicts, if several strategies were tried. When
that's the case but the best strategy was not the first one tried, we
inform the user which strategy was the "best" one before recreating the
merge and leaving the conflicted files in the tree.

This informational message is missing the word "strategy", so it shows
something like:

    Using the recursive to prepare resolving by hand.

Fix that.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-23 15:45:33 -07:00
d3da223f22 cache-tree: prefetch in partial clone read-tree
"git read-tree" checks the existence of the blobs referenced by the
given tree, but does not bulk prefetch them. Add a bulk prefetch.

The lack of prefetch here was noticed at $DAYJOB during a merge
involving some specific commits, but I couldn't find a minimal merge
that didn't also trigger the prefetch in check_updates() in
unpack-trees.c (and in all these cases, the lack of prefetch in
cache-tree.c didn't matter because all the relevant blobs would have
already been prefetched by then). This is why I used read-tree here to
exercise this code path.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-23 14:22:21 -07:00
b2896d2739 unpack-trees: refactor prefetching code
Refactor the prefetching code in unpack-trees.c into its own function,
because it will be used elsewhere in a subsequent commit.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-23 14:21:57 -07:00
dc1daacdcc pack-bitmap: check pack validity when opening bitmap
When pack-objects adds an entry to its list of objects to pack, it may
mark the packfile and offset that contains the file, which we can later
use to output the object verbatim.  If the packfile is deleted while we
are running (e.g., by another process running "git repack"), we may die
in use_pack() if the pack file cannot be opened.

We worked around this in 4c08018204 (pack-objects: protect against
disappearing packs, 2011-10-14) by making sure we can open the pack
before recording it as a source. This detects a pack which has already
disappeared while generating the packing list, and because we keep the
pack's file descriptor (or an mmap window) open, it means we can access
it later (unless you exceed core.packedgitlimit).

The bitmap code that was added later does not do this; it adds entries
to the packlist without checking that the packfile is still valid, and
is vulnerable to this race. It needs the same treatment as 4c08018204.

However, rather than add it in just that one spot, it makes more sense
to simply open and check the packfile when we open the bitmap.
Technically you can use the .bitmap without even looking in the .pack
file (e.g., if you are just printing a list of objects without accessing
them), but it's much simpler to do it early. That covers all later
direct uses of the pack (due to the cached descriptor) without having to
check each one directly. For example, in pack-objects we need to protect
the packlist entries, but we also access the pack directly as part of
the reuse_partial_pack_from_bitmap() feature. This patch covers both
cases.

There's no test here, because the problem is inherently racy. I
reproduced and verified the fix with this script:

  rm -rf parent.git push.git fetch.git

  push() {
    (
      cd push.git &&
      echo content >>file &&
      git add file &&
      git commit -qm "change $1" &&
      git push -q origin HEAD &&
      echo "push $1..."
    ) &&
    (
      cd parent.git &&
      git repack -ad -q &&
      echo "repack $1..."
    )
  }

  fetch() {
    rm -rf fetch.git &&
    git clone -q file://$PWD/parent.git fetch.git &&
    echo "fetch $1..."
  }

  git init --bare parent.git &&
  git --git-dir=parent.git config transfer.unpacklimit 1 &&
  git clone parent.git push.git &&
  (for i in `seq 1 1000`; do push $i || break; done) &
  pusher=$!
  (for i in `seq 1 1000`; do fetch $i || break; done) &
  fetcher=$!
  wait $fetcher
  kill $pusher

That simulates a race between a client cloning and a push triggering a
repack on the server. Without this patch, it generally fails within a
couple hundred iterations with:

  remote: fatal: packfile ./objects/pack/.tmp-1377349-pack-498afdec371232bdb99d1757872f5569331da61e.pack cannot be accessed
  error: git upload-pack: git-pack-objects died with error.
  fatal: git upload-pack: aborting due to possible repository corruption on the remote side.
  remote: aborting due to possible repository corruption on the remote side.
  fatal: early EOF
  fatal: fetch-pack: invalid index-pack output

With this patch, it reliably runs through all thousand attempts.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-23 11:37:56 -07:00
2f732bf15e tr2: log parent process name
It can be useful to tell who invoked Git - was it invoked manually by a
user via CLI or script? By an IDE?  In some cases - like 'repo' tool -
we can influence the source code and set the GIT_TRACE2_PARENT_SID
environment variable from the caller process. In 'repo''s case, that
parent SID is manipulated to include the string "repo", which means we
can positively identify when Git was invoked by 'repo' tool. However,
identifying parents that way requires both that we know which tools
invoke Git and that we have the ability to modify the source code of
those tools. It cannot scale to keep up with the various IDEs and
wrappers which use Git, most of which we don't know about. Learning
which tools and wrappers invoke Git, and how, would give us insight to
decide where to improve Git's usability and performance.

Unfortunately, there's no cross-platform reliable way to gather the name
of the parent process. If procfs is present, we can use that; otherwise
we will need to discover the name another way. However, the process ID
should be sufficient to look up the process name on most platforms, so
that code may be shareable.

Git for Windows gathers similar information and logs it as a "data_json"
event. However, since "data_json" has a variable format, it is difficult
to parse effectively in some languages; instead, let's pursue a
dedicated "cmd_ancestry" event to record information about the ancestry
of the current process and a consistent, parseable way.

Git for Windows also gathers information about more than one generation
of parent. In Linux further ancestry info can be gathered with procfs,
but it's unwieldy to do so. In the interest of later moving Git for
Windows ancestry logging to the 'cmd_ancestry' event, and in the
interest of later adding more ancestry to the Linux implementation - or
of adding this functionality to other platforms which have an easier
time walking the process tree - let's make 'cmd_ancestry' accept an
array of parentage.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-22 13:35:20 -07:00
b7e6a41622 tr2: make process info collection platform-generic
To pave the way for non-Windows platforms to define
trace2_collect_process_info(), reorganize the stub-or-definition schema
to something which doesn't directly reference Windows.

Platforms which want to collect parent process information in the
future should:

 1. Add an implementation to compat/ (e.g. compat/somearch/procinfo.c)
 2. Add that object to COMPAT_OBJS to config.mak.uname
    (e.g. COMPAT_OBJS += compat/somearch/procinfo.o)
 3. Define HAVE_PLATFORM_PROCINFO in config.mak.uname

In the Windows case, this definition lives in
compat/win32/trace2_win32_process_info.c, which is already conditionally
added to COMPAT_OBJS; so let's add HAVE_PLATFORM_PROCINFO to hint to the
build that compat/stub/procinfo.c should not be used.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-22 13:35:20 -07:00
63766510a1 bundle tests: use test_cmp instead of grep
Change the bundle tests to fully compare the expected "git ls-remote"
or "git bundle list-heads" output, instead of merely grepping it.

This avoids subtle regressions in the tests. In
f62e0a39b6 (t5704 (bundle): add tests for bundle --stdin, 2010-04-19)
the "bundle --stdin <rev-list options>" test was added to make sure we
didn't include the tag.

But since the --stdin mode didn't work until 5bb0fd2cab (bundle:
arguments can be read from stdin, 2021-01-11) our grepping of
"master" (later "main") missed the important part of the test.

Namely that we should not include the "refs/tags/tag" tag in that
case. Since the test only grepped for "main" in the output we'd miss a
regression in that code.

So let's use test_cmp instead, and also in the other nearby tests
where it's easy.

This does make things a bit more verbose in the case of the test
that's checking the bundle header, since it's different under SHA1 and
SHA256. I think this makes test easier to follow.

I've got some WIP changes to extend the "git bundle" command to dump
parts of the header out, which are easier to understand if we test the
output explicitly like this.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-22 13:29:32 -07:00
95cf6464dd bundle tests: use ">file" not ": >file"
Change uses of ":" on the LHS of a ">" to the more commonly used
">file" pattern in t/t5607-clone-bundle.sh.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-22 13:29:30 -07:00
eb27b338a3 The sixth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-22 13:05:57 -07:00
fe3fec53a6 Merge branch 'bc/rev-list-without-commit-line'
"git rev-list" learns to omit the "commit <object-name>" header
lines from the output with the `--no-commit-header` option.

* bc/rev-list-without-commit-line:
  rev-list: add option for --pretty=format without header
2021-07-22 13:05:56 -07:00
33309e428b Merge branch 'ab/imap-send-read-everything-simplify'
Code simplification.

* ab/imap-send-read-everything-simplify:
  imap-send.c: use less verbose strbuf_fread() idiom
2021-07-22 13:05:56 -07:00
bb3a55f6d3 Merge branch 'ab/gitignore-discovery-doc'
Doc update.

* ab/gitignore-discovery-doc:
  docs: .gitignore parsing is to the top of the repo
2021-07-22 13:05:55 -07:00
dae59cb263 Merge branch 'js/ci-windows-update'
GitHub Actions / CI update.

* js/ci-windows-update:
  ci: accelerate the checkout
  ci (vs-build): build with NO_GETTEXT
  artifacts-tar: respect NO_GETTEXT
  ci (windows): transfer also the Git-tracked files to the test jobs
  ci: upgrade to using actions/{up,down}load-artifacts v2
  ci (vs-build): use `cmd` to copy the DLLs, not `powershell`
  ci: use the new GitHub Action to download git-sdk-64-minimal
2021-07-22 13:05:55 -07:00
8de2e2e41b Merge branch 'ab/send-email-optim'
"git send-email" optimization.

* ab/send-email-optim:
  perl: nano-optimize by replacing Cwd::cwd() with Cwd::getcwd()
  send-email: move trivial config handling to Perl
  perl: lazily load some common Git.pm setup code
  send-email: lazily load modules for a big speedup
  send-email: get rid of indirect object syntax
  send-email: use function syntax instead of barewords
  send-email: lazily shell out to "git var"
  send-email: lazily load config for a big speedup
  send-email: copy "config_regxp" into git-send-email.perl
  send-email: refactor sendemail.smtpencryption config parsing
  send-email: remove non-working support for "sendemail.smtpssl"
  send-email tests: test for boolean variables without a value
  send-email tests: support GIT_TEST_PERL_FATAL_WARNINGS=true
2021-07-22 13:05:54 -07:00
8f0c15bfb6 Merge branch 'jk/typofix'
Typofix.

* jk/typofix:
  doc/rev-list-options: fix duplicate word typo
2021-07-22 13:05:54 -07:00
f003a91f5c SubmittingPatches: replace discussion of Travis with GitHub Actions
Replace the discussion of Travis CI added in
0e5d028a7a (Documentation: add setup instructions for Travis CI,
2016-05-02) with something that covers the GitHub Actions added in
889cacb689 (ci: configure GitHub Actions for CI/PR, 2020-04-11).

The setup is trivial compared to using Travis, and it even works on
Windows (that "hopefully soon" comment was probably out-of-date on
Travis as well).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-22 12:53:14 -07:00
4523dc8624 SubmittingPatches: move discussion of Signed-off-by above "send"
Move the section discussing the addition of a SOB trailer above the
section that discusses generating the patch itself. This makes sense
as we don't want someone to go through the process of "git
format-patch", only to realize late that they should have used "git
commit -s" or equivalent.

This is a move-only change, no lines here are being altered, only
moved around.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-22 12:53:14 -07:00
6f843a3355 pull: fix handling of multiple heads
With multiple heads, we should not allow rebasing or fast-forwarding.
Make sure any fast-forward request calls out specifically the fact that
multiple branches are in play.  Also, since we cannot fast-forward to
multiple branches, fix our computation of can_ff.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-22 11:54:30 -07:00
359ff69389 pull: update docs & code for option compatibility with rebasing
git-pull.txt includes merge-options.txt, which is written assuming
merges will happen.  git-pull has allowed rebases for many years; update
the documentation to reflect that.

While at it, pass any `--signoff` flag through to the rebase backend too
so that we don't have to document it as merge-specific.  Rebase has
supported the --signoff flag for years now as well.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-22 11:54:30 -07:00
031e2f7ae1 pull: abort by default when fast-forwarding is not possible
We have for some time shown a long warning when the user does not
specify how to reconcile divergent branches with git pull.  Make it an
error now.

Initial-patch-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-22 11:54:29 -07:00
adc27d6a93 pull: make --rebase and --no-rebase override pull.ff=only
Fix the last few precedence tests failing in t7601 by now implementing
the logic to have --[no-]rebase override a pull.ff=only config setting.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-22 11:54:29 -07:00
e4dc25ed49 pull: since --ff-only overrides, handle it first
There are both merge and rebase branches in the logic, and previously
both had to handle fast-forwarding.  Merge handled that implicitly
(because git merge handles it directly), while in rebase it was
explicit.  Given that the --ff-only flag is meant to override any
--rebase or --no-rebase, make the code reflect that by handling
--ff-only before the merge-vs-rebase logic.

It turns out that this also fixes a bug for submodules.  Previously,
when --ff-only was given, the code would run `merge --ff-only` on the
main module, and then run `submodule update --recursive --rebase` on the
submodules.  With this change, we still run `merge --ff-only` on the
main module, but now run `submodule update --recursive --checkout` on
the submodules.  I believe this better reflects the intent of --ff-only
to have it apply to both the main module and the submodules.

(Sidenote: It is somewhat interesting that all merges pass `--checkout`
to submodule update, even when `--no-ff` is specified, meaning that it
will only do fast-forward merges for submodules.  This was discussed in
commit a6d7eb2c7a ("pull: optionally rebase submodules (remote submodule
changes only)", 2017-06-23).  The same limitations apply now as then, so
we are not trying to fix this at this time.)

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-22 11:54:29 -07:00
7171221d82 Makefile: don't use "FORCE" for tags targets
Remove the "FORCE" dependency from the "tags", "TAGS" and "cscope"
targets, instead make them depend on whether or not the relevant
source files have changed.

For the cscope target we need to change it to depend on the actual
generated file while we generate while we're at it, as the next commit
will discuss we always generate a cscope.out file.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-22 09:29:24 -07:00
d3236becec doc: pull: fix rebase=false documentation
"git pull --rebase=false" means we merge their history into ours, but
it has been described the other way around.

Cc: Stephen Haberman <stephen@exigencecorp.com>
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
[jc: updated the log message]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-21 16:34:06 -07:00
866a3014de test-lib tests: move "run_sub_test" to a new lib-subtest.sh
Move the "check_sub_test_lib_test()" and its sister functions to a new
lib-subtest.sh.

In the future (not in this series) I'd like to test test-lib's output
in a more targeted and smaller test, and I'll need these functions to
do that.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-21 16:27:04 -07:00
33e5da5b6c Merge branch 'ps/t0000-output-directory-fix' into ab/lib-subtest
* ps/t0000-output-directory-fix:
  t0000: fix test if run with TEST_OUTPUT_DIRECTORY
2021-07-21 16:26:13 -07:00
b9457af004 Merge branch 'jk/t0000-subtests-fix' into ab/lib-subtest
* jk/t0000-subtests-fix:
  t0000: clear GIT_SKIP_TESTS before running sub-tests
2021-07-21 16:26:06 -07:00
3d5fc24dae pull: abort if --ff-only is given and fast-forwarding is impossible
The warning about pulling without specifying how to reconcile divergent
branches says that after setting pull.rebase to true, --ff-only can
still be passed on the command line to require a fast-forward. Make that
actually work.

Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
[en: updated tests; note 3 fixes and 1 new failure]
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-20 21:43:12 -07:00
1d25e5bdf5 t7601: add tests of interactions with multiple merge heads and config
There were already code checking that --rebase was incompatible with
a merge of multiple heads.  However, we were sometimes throwing warnings
about lack of specification of rebase vs. merge when given multiple
heads.  Since rebasing is disallowed with multiple merge heads, that
seems like a poor warning to print; we should instead just assume
merging is wanted.

Add a few tests checking multiple merge head behavior.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-20 21:43:12 -07:00
be19c5ca3e t7601: test interaction of merge/rebase/fast-forward flags and options
The interaction of rebase and merge flags and options was not well
tested.  Add several tests to check for correct behavior from the
following rules:
    * --ff-only vs. --[no-]rebase
      (and the related pull.ff=only vs. pull.rebase)
    * --rebase[=!false] vs. --no-ff and --ff
      (and the related pull.rebase=!false overrides pull.ff=!only)
    * command line flags take precedence over config, except:
      * --no-rebase heeds pull.ff=!only
      * pull.rebase=!false vs --no-ff and --ff

For more details behind these rules and a larger table of individual
cases, refer to https://lore.kernel.org/git/xmqqwnpqot4m.fsf@gitster.g/
and the links found therein.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-20 21:43:12 -07:00
ddcb189d9d pack-bitmap: clarify comment in filter_bitmap_exclude_type()
The code that eventually became filter_bitmap_exclude_type() was
originally introduced in 4f3bd5606a (pack-bitmap: implement BLOB_NONE
filtering, 2020-02-14) to accelerate BLOB_NONE filters with bitmaps.

In 856e12c18a (pack-bitmap.c: make object filtering functions generic,
2020-05-04), it became filter_bitmap_exclude_type(). But not all of the
comments were updated to be agnostic to the provided type.

Remove the remaining comments which should have been updated in
856e12c18a to reflect the type-agnostic nature of the function.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-20 15:04:07 -07:00
e05cdb17e8 unpack-trees: resolve sparse-directory/file conflicts
When running unpack_trees() with a sparse index, we attempt to operate
on the index without expanding the sparse directory entries. Thus, we
operate by manipulating entire directories and passing them to the
unpack function. In the case of the 'git checkout' command, this is the
twoway_merge() function.

There are several cases in twoway_merge() that handle different
situations. One new one to add is the case of a directory/file conflict
where the directory is sparse. Before the sparse index, such a conflict
would appear as a list of file additions and deletions. Now,
twoway_merge() initializes 'current', 'oldtree', and 'newtree' from
src[0], src[1], and src[2], then sets 'oldtree' to NULL because it is
equal to the df_conflict_entry. The way to determine that we have a
directory/file conflict is to test that 'current' and 'newtree' disagree
on being sparse directory entries.

When we are in this case, we want to resolve the situation by calling
merged_entry(). This allows replacing the 'current' entry with the
'newtree' entry. This is important for cases where we want to run 'git
checkout' across the conflict and have the new HEAD represent the new
file type at that path. The first NEEDSWORK comment dropped in t1092
demonstrates this necessary behavior.

However, we still are in a confusing state when 'current' corresponds to
a staged change within a sparse directory that is not present at HEAD.
This should be atypical, because it requires adding a change outside of
the sparse-checkout cone, but it is possible. Since we are unable to
determine that this is a staged change within twoway_merge(), we cannot
add a case to reject the merge at this point. I believe this is due to
the use of df_conflict_entry in the place of 'oldtree' instead of using
the valud at HEAD, which would provide some perspective to this
decision. Any change that would allow this differentiation for staged
entries would need to involve information further up in unpack_trees().

That work should be done, sometime, because we are further confusing the
behavior of a directory/file conflict when staging a change in the
directory. The two cases 'checkout behaves oddly with df-conflict-?' in
t1092 demonstrate that even without a sparse-checkout, Git is not
consistent in its behavior. Neither of the two options seems correct,
either. This change makes the sparse-index behave differently than the
typcial sparse-checkout case, but it does match the full checkout
behavior in the df-conflict-2 case.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-20 14:59:11 -07:00
70569fadce t1092: document bad 'git checkout' behavior
Add new branches to the test repo that demonstrate directory/file
conflicts in different ways. Since the directory 'folder1/' has
adjacent files 'folder1-', 'folder1.txt', and 'folder10' it causes
searches for 'folder1/' to land in a different place in the index than a
search for 'folder1'. This causes a change in behavior when working with
the df-conflict-1 and df-conflict-2 branches, whose only difference is
that the first uses 'folder1' as the conflict and the other uses
'folder2' which does not have these adjacent files.

We can extend two tests that compare the behavior across different 'git
checkout' commands, and we see already that the behavior will be
different in some cases and not in others. The difference between the
two test loops is that one uses 'git reset --hard' between iterations.

Further, we isolate the behavior of creating a staged change within a
directory and then checking out a branch where that directory is
replaced with a file. A full checkout behaves differently across these
two cases, while a sparse-checkout cone behaves consistently. In both
cases, the behavior is wrong. In one case, the staged change is dropped
entirely. The other case the staged change is kept, replacing the file
at that location, but none of the other files in the directory are kept.

Likely, the correct behavior in this case is to reject the checkout and
report the conflict, leaving HEAD in its previous location. None of the
cases behave this way currently. Use comments to demonstrate that the
tested behavior is only a documentation of the current, incorrect
behavior to ensure we do not _accidentally_ change it. Instead, we would
prefer to change it on purpose with a future change.

At this point, the sparse-index does not handle these 'git checkout'
commands correctly. Or rather, it _does_ reject the 'git checkout' when
we have the staged change, but for the wrong reason. It also rejects the
'git checkout' commands when there is no staged change and we want to
replace a directory with a file. A fix for that unstaged case will
follow in the next change, but that will make the sparse-index agree
with the full checkout case in these documented incorrect behaviors.

Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-20 14:59:11 -07:00
878b399734 doc: clarify description of 'submodule.recurse'
The doc for 'submodule.recurse' starts with "Specifies if commands
recurse into submodles by default". This is not exactly true of all
commands that have a '--recurse-submodules' option. For example, 'git
pull --recurse-submodules' does not run 'git pull' in each submodule,
but rather runs 'git submodule update --recursive' so that the submodule
working trees after the pull matches the commits recorded in the
superproject.

Clarify that by just saying that it enables '--recurse-submodules'.

Note that the way this setting interacts with 'fetch.recurseSubmodules'
and 'push.recurseSubmodules', which can have other values than true or
false, is already documented since 4da9e99e6e (doc: be more precise on
(fetch|push).recurseSubmodules, 2020-04-06).

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-20 14:57:43 -07:00
734283855f doc/git-config: simplify "override" advice for FILES section
At the end of the FILES section, we indicate that you can override the
regular lookup rules with --global, etc. But:

  - we're missing the --local option

  - we point to GIT_CONFIG instead of --file, but the latter has much
    better documentation

  - we're vague about how the overrides work; the actual option
    descriptions are much better here

So let's just mention the names and point people back to the OPTIONS
section. We could perhaps even delete this paragraph entirely, but the
presence of the names may give people reading FILES a clue about where
to look for more information.

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-20 14:55:06 -07:00
b3b186262f doc/git-config: clarify GIT_CONFIG environment variable
The scope and utility of the GIT_CONFIG variable was drastically reduced
by dc87183189 (Only use GIT_CONFIG in "git config", not other programs,
2008-06-30). But the documentation in git-config(1) predates that, which
makes it rather misleading.

These days it is really just another way to say "--file". So let's say
that, and explicitly make it clear that it does not impact other Git
commands (like GIT_CONFIG_SYSTEM, etc, would).

I also bumped it to the bottom of the list of variables, and warned
people off of using it. We don't have any plans for deprecation at this
point, but there's little point in encouraging people to use it by
putting it at the top of the list.

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-20 14:55:06 -07:00
4bb9eb5f91 doc/git-config: explain --file instead of referring to GIT_CONFIG
The explanation for the --file option only refers to GIT_CONFIG. This
redirection to an environment variable is confusing, but doubly so
because the description of GIT_CONFIG is out of date.

Let's describe --file from scratch, detailing both the reading and
writing behavior as we do for other similar options like --system, etc.

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-20 14:55:06 -07:00
8b09a900a1 merge-ort: restart merge with cached renames to reduce process entry cost
The merge algorithm mostly consists of the following three functions:
   collect_merge_info()
   detect_and_process_renames()
   process_entries()
Prior to the trivial directory resolution optimization of the last half
dozen commits, process_entries() was consistently the slowest, followed
by collect_merge_info(), then detect_and_process_renames().  When the
trivial directory resolution applies, it often dramatically decreases
the amount of time spent in the two slower functions.

Looking at the performance results in the previous commit, the trivial
directory resolution optimization helps amazingly well when there are no
relevant renames.  It also helps really well when reapplying a long
series of linear commits (such as in a rebase or cherry-pick), since the
relevant renames may well be cached from the first reapplied commit.
But when there are any relevant renames that are not cached (represented
by the just-one-mega testcase), then the optimization does not help at
all.

Often, I noticed that when the optimization does not apply, it is
because there are a handful of relevant sources -- maybe even only one.
It felt frustrating to need to recurse into potentially hundreds or even
thousands of directories just for a single rename, but it was needed for
correctness.

However, staring at this list of functions and noticing that
process_entries() is the most expensive and knowing I could avoid it if
I had cached renames suggested a simple idea: change
   collect_merge_info()
   detect_and_process_renames()
   process_entries()
into
   collect_merge_info()
   detect_and_process_renames()
   <cache all the renames, and restart>
   collect_merge_info()
   detect_and_process_renames()
   process_entries()

This may seem odd and look like more work.  However, note that although
we run collect_merge_info() twice, the second time we get to employ
trivial directory resolves, which makes it much faster, so the increased
time in collect_merge_info() is small.  While we run
detect_and_process_renames() again, all renames are cached so it's
nearly a no-op (we don't call into diffcore_rename_extended() but we do
have a little bit of data structure checking and fixing up).  And the
big payoff comes from the fact that process_entries(), will be much
faster due to having far fewer entries to process.

This restarting only makes sense if we can save recursing into enough
directories to make it worth our while.  Introduce a simple heuristic to
guide this.  Note that this heuristic uses a "wanted_factor" that I have
virtually no actual real world data for, just some back-of-the-envelope
quasi-scientific calculations that I included in some comments and then
plucked a simple round number out of thin air.  It could be that
tweaking this number to make it either higher or lower improves the
optimization.  (There's slightly more here; when I first introduced this
optimization, I used a factor of 10, because I was completely confident
it was big enough to not cause slowdowns in special cases.  I was
certain it was higher than needed.  Several months later, I added the
rough calculations which make me think the optimal number is close to 2;
but instead of pushing to the limit, I just bumped it to 3 to reduce the
risk that there are special cases where this optimization can result in
slowing down the code a little.  If the ratio of path counts is below 3,
we probably will only see minor performance improvements at best
anyway.)

Also, note that while the diffstat looks kind of long (nearly 100
lines), more than half of it is in two comments explaining how things
work.

For the testcases mentioned in commit 557ac0350d ("merge-ort: begin
performance work; instrument with trace2_region_* calls", 2020-10-28),
this change improves the performance as follows:

                            Before                  After
    no-renames:      205.1  ms ±  3.8  ms   204.2  ms ±  3.0  ms
    mega-renames:      1.564 s ±  0.010 s     1.076 s ±  0.015 s
    just-one-mega:   479.5  ms ±  3.9  ms   364.1  ms ±  7.0  ms

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-20 14:47:40 -07:00
7bee6c1004 merge-ort: avoid recursing into directories when we don't need to
This combines the work of the last several patches, and implements the
conditions when we don't need to recurse into directories.  It's perhaps
easiest to see the logic by separating the fact that a directory might
have both rename sources and rename destinations:

  * rename sources: only files present in the merge base can serve as
    rename sources, and only when one side deletes that file.  When the
    tree on one side matches the merge base, that means every file
    within the subtree matches the merge base.  This means that the
    skip-irrelevant-rename-detection optimization from before kicks in
    and we don't need any of these files as rename sources.

  * rename destinations: the tree that does not match the merge base
    might have newly added and hence unmatched destination files.
    This is what usually prevents us from doing trivial directory
    resolutions in the merge machinery.  However, the fact that we have
    deferred recursing into this directory until the end means we know
    whether there are any unmatched relevant potential rename sources
    elsewhere in this merge.  If there are no unmatched such relevant
    sources anywhere, then there is no need to look for unmatched
    potential rename destinations to match them with.

This informs our algorithm:
  * Search through relevant_sources; if we have entries, they better all
    be reflected in cached_pairs or cached_irrelevant, otherwise they
    represent an unmatched potential rename source (causing the
    optimization to be disallowed).
  * For any relevant_source represented in cached_pairs, we do need to
    to make sure to get the destination for each source, meaning we need
    to recurse into any ancestor directories of those destinations.
  * Once we've recursed into all the rename destinations for any
    relevant_sources in cached_pairs, we can then do the trivial
    directory resolution for the remaining directories.

For the testcases mentioned in commit 557ac0350d ("merge-ort: begin
performance work; instrument with trace2_region_* calls", 2020-10-28),
this change improves the performance as follows:

                            Before                  After
    no-renames:        5.235 s ±  0.042 s   205.1  ms ±  3.8  ms
    mega-renames:      9.419 s ±  0.107 s     1.564 s ±  0.010 s
    just-one-mega:   480.1  ms ±  3.9  ms   479.5  ms ±  3.9  ms

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-20 14:47:40 -07:00
5e1ca57a7b merge-ort: defer recursing into directories when merge base is matched
When one side of history matches the merge base (including when the
merge base has no entry for the given directory), have
collect_merge_info_callback() defer recursing into the directory.  To
ensure those entries are eventually handled, add a call to
handled_deferred_entries() in collect_merge_info() after
traverse_trees() returns.

Note that the condition in collect_merge_info_callback() may look more
complicated than necessary at first glance;
renames->trivial_merges_okay[side] is always true until
handle_deferred_entries() is called, and possible_trivial_merges[side]
is always empty right now (and in the future won't be filled until
handle_deferred_entries() is called).  However, when
handle_deferred_entries() calls traverse_trees() for the relevant
deferred directories, those traverse_trees() calls will once again end
up in collect_merge_info_callback() for all the entries under those
subdirectories.  The extra conditions are there for such deferred cases
and will be used more as we do more with those variables.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-20 14:47:39 -07:00
e0ef578eae merge-ort: add a handle_deferred_entries() helper function
In order to allow trivial directory resolution, we first need to be able
to gather more information to determine if the optimization is safe.  To
enable that, we need a way of deferring the recursion into the directory
until a later time.  Naturally, deferring the entry into a subtree means
that we need some function that will later recurse into the subdirectory
exactly the same way that collect_merge_info_callback() would have done.

Add a helper function that does this.  For now this function is not used
but a subsequent commit will change that.  Future commits will also make
the function sometimes resolve directories instead of traversing inside.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-20 14:47:39 -07:00
d478f56759 merge-ort: add data structures for allowable trivial directory resolves
As noted a few commits ago, we can resolve individual files early if all
three sides of the merge have a file at the path and two of the three
sides match.  We would really like to do the same thing with
directories, because being able to do a trivial directory resolve means
we don't have to recurse into the directory, potentially saving us a
huge amount of time in both collect_merge_info() and process_entries().
Unfortunately, resolving directories early would mean missing any
renames whose source or destination is underneath that directory.

If we somehow knew there weren't any renames under the directory in
question, then we could resolve it early.  Sadly, it is impossible to
determine whether there are renames under the directory in question
without recursing into it, and this has traditionally kept us from ever
implementing such an optimization.

In commit f89b4f2bee ("merge-ort: skip rename detection entirely if
possible", 2021-03-11), we added an additional reason that rename
detection could be skipped entirely -- namely, if no *relevant* sources
were present.  Without completing collect_merge_info_callback(), we do
not yet know if there are no relevant sources.  However, we do know that
if the current directory on one side matches the merge base, then every
source file within that directory will not be RELEVANT_CONTENT, and a
few simple checks can often let us rule out RELEVANT_LOCATION as well.
This suggests we can just defer recursing into such directories until
the end of collect_merge_info.

Since the deferred directories are known to not add any relevant sources
due to the above properties, then if there are no relevant sources after
we've traversed all paths other than the deferred ones, then we know
there are not any relevant sources.  Under those conditions, rename
detection is unnecessary, and that means we can resolve the deferred
directories without recursing into them.

Note that the logic for skipping rename detection was also modified
further in commit 76e253793c ("merge-ort, diffcore-rename: employ cached
renames when possible", 2021-01-30); in particular rename detection can
be skipped if we already have cached renames for each relevant source.
We can take advantage of this information as well with our deferral of
recursing into directories where one side matches the merge base.

Add some data structures that we will use to do these deferrals, with
some lengthy comments explaining their purpose.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-20 14:47:39 -07:00
528fc51b6d merge-ort: add some more explanations in collect_merge_info_callback()
The previous patch possibly raises some questions about whether
additional cases in collect_merge_info_callback() can be handled early.
Add some explanations in the form of comments to help explain these
better.  While we're at it, add a few comments to denote what a few
boolean '0' or '1' values stand for.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-20 14:47:39 -07:00
785bf2088e merge-ort: resolve paths early when we have sufficient information
When there are no directories involved at a given path, and all three
sides have a file at that path, and two of the three sides of history
match, we can immediately resolve the merge of that path in
collect_merge_info() and do not need to wait until process_entries().

This is actually a very minor improvement: half the time when I run it,
I see an improvement; the other half a slowdown.  It seems to be in the
range of noise.  However, this idea serves as the beginning of some
bigger optimizations coming in the following patches.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-20 14:47:39 -07:00
ade1552598 t0000: fix test if run with TEST_OUTPUT_DIRECTORY
Testcases in t0000 are quite special given that they many of them run
nested testcases to verify that testing functionality itself works as
expected. These nested testcases are realized by writing a new ad-hoc
test script which again sources test-lib.sh, where the new script is
created in a nested subdirectory located beneath the current trash
directory. We then execute the new test script with the nested
subdirectory as current working directory and explicitly re-export
TEST_OUTPUT_DIRECTORY to point to that directory.

While this works as expected in the general case, it falls apart when
the developer has TEST_OUTPUT_DIRECTORY explicitly defined either via
the environment or via config.mak and runs "make test". In that case,
test-lib.sh will clobber the value that we've just carefully set up to
instead contain what the developer has defined. As a result, the
TEST_OUTPUT_DIRECTORY continues to point at the root output directory,
not at the nested one.

This issue causes breakage in the 'test_atexit is run' test case: the
nested test case writes files into "../../", which is assumed to be the
parent's trash directory. But because TEST_OUTPUT_DIRECTORY already
points to to the root output directory, we instead end up writing those
files outside of the output directory. The parent test case will then
try to check whether those files still exist in its own trash directory,
which thus must fail now.

Fix the issue by adding a new TEST_OUTPUT_DIRECTORY_OVERRIDE variable.
If set, then we'll always override the TEST_OUTPUT_DIRECTORY with its
value after sourcing GIT-BUILD-OPTIONS.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-20 09:19:02 -07:00
88617d11f9 multi-pack-index: fix potential segfault without sub-command
Since cd57bc41bb (builtin/multi-pack-index.c: display usage on
unrecognized command, 2021-03-30) we have used a "usage" label to avoid
having two separate callers of usage_with_options (one when no arguments
are given, and another for unrecognized sub-commands).

But the first caller has been broken since cd57bc41bb, since it will
happily jump to usage without arguments, and then pass argv[0] to the
"unrecognized subcommand" error.

Many compilers will save us from a segfault here, but the end result is
ugly, since it mentions an unrecognized subcommand when we didn't even
pass one, and (on GCC) includes "(null)" in its output.

Move the "usage" label down past the error about unrecognized
subcommands so that it is only triggered when it should be. While we're
at it, bulk up our test coverage in this area, too.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-19 15:24:01 -07:00
c510928a25 refs/debug: quote prefix
This makes the empty prefix ("") stand out better.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-19 14:32:34 -07:00
ac223c4047 t0000: clear GIT_SKIP_TESTS before running sub-tests
In t0000, we run several fake "sub-test" suites to verify the behavior
of the test suite. But because we don't clear the parent environment
completely, the sub-tests can be fooled by variables meant for the
parent. For example:

  GIT_SKIP_TESTS=t1234 ./t0000-basic.sh

fails when a sub-test expects its fake t1234 to actually run. This
particular pattern is unlikely in practice; we're running a single
script, and there is no t1234 in the real test suite anyway (not yet, at
least). A more real-world example is:

  GIT_SKIP_TESTS=t[^0]* make test

to run only the t0* tests.

The fix is conceptually simple: we should clear the GIT_SKIP_TESTS
variable when running the sub-tests, because its contents (if any) will
be meant for the main test suite. This is easy to do centrally in our
sub-test helper.

But there's a catch: some of our tests do set GIT_SKIP_TESTS
intentionally to test the feature. We need to allow them to continue to
set it, but clear it for all the other tests. And the sub-test helper
can't tell if the GIT_SKIP_TESTS it sees is from a test or not. We can
handle this by adding a new option to the helper to let callers specify
the skip list.

I considered adding a more general "--eval" option to let callers set up
the env for the sub-test however they like. That would cover this case
and possible future ones. But the quoting gets awkward for the callers
(since we're now 2 layers deep in evals!), so I went with the simpler
more specific solution.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-19 13:26:00 -07:00
64f0109f17 test-lib-functions: use test-tool for [de]packetize()
The shell+perl "[de]packetize()" helper functions were added in
4414a15002 (t/lib-git-daemon: add network-protocol helpers,
2018-01-24), and around the same time we added the "pkt-line" helper
in 74e7002961 (test-pkt-line: introduce a packet-line test helper,
2018-03-14).

For some reason it seems we've mostly used the shell+perl version
instead of the helper since then. There were discussions around
88124ab263 (test-lib-functions: make packetize() more efficient,
2020-03-27) and cacae4329f (test-lib-functions: simplify packetize()
stdin code, 2020-03-29) to improve them and make them more efficient.

There was one good reason to do so, we needed an equivalent of
"test-tool pkt-line pack", but that command wasn't capable of handling
input with "\n" (a feature) or "\0" (just because it happens to be
printf-based under the hood).

Let's add a "pkt-line-raw" helper for that, and expose is at a
packetize_raw() to go with the existing packetize() on the shell
level, this gives us the smallest amount of change to the tests
themselves.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-19 11:53:50 -07:00
daab8a564f The fifth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-16 17:42:53 -07:00
8e62a85352 Merge branch 'ds/gender-neutral-doc'
Update the documentation not to assume users are of certain gender
and adds to guidelines to do so.

* ds/gender-neutral-doc:
  *: fix typos
  comments: avoid using the gender of our users
  doc: avoid using the gender of other people
2021-07-16 17:42:53 -07:00
8721e2eaed Merge branch 'jt/partial-clone-submodule-1'
Prepare the internals for lazily fetching objects in submodules
from their promisor remotes.

* jt/partial-clone-submodule-1:
  promisor-remote: teach lazy-fetch in any repo
  run-command: refactor subprocess env preparation
  submodule: refrain from filtering GIT_CONFIG_COUNT
  promisor-remote: support per-repository config
  repository: move global r_f_p_c to repo struct
2021-07-16 17:42:53 -07:00
bd4232fac3 Merge branch 'ab/struct-init'
Code cleanup around struct_type_init() functions.

* ab/struct-init:
  string-list.h users: change to use *_{nodup,dup}()
  string-list.[ch]: add a string_list_init_{nodup,dup}()
  dir.[ch]: replace dir_init() with DIR_INIT
  *.c *_init(): define in terms of corresponding *_INIT macro
  *.h: move some *_INIT to designated initializers
2021-07-16 17:42:53 -07:00
832a239b72 Merge branch 'dd/test-stdout-count-lines'
Tiny test clean-up.

* dd/test-stdout-count-lines:
  t6402: preserve git exit status code
  t6400: preserve git ls-files exit status code
  test-lib-functions: introduce test_stdout_line_count
2021-07-16 17:42:52 -07:00
c4670b8a8d Merge branch 'hn/refs-test-cleanup'
Test clean-up.

* hn/refs-test-cleanup:
  t7509: avoid direct file access for writing CHERRY_PICK_HEAD
  t1415: avoid direct filesystem access for writing refs
2021-07-16 17:42:52 -07:00
a91e0bb833 Merge branch 'rs/khash-alloc-cleanup'
Code clean-up.

* rs/khash-alloc-cleanup:
  khash: clarify that allocations never fail
2021-07-16 17:42:52 -07:00
8eb90d385c Merge branch 'ar/help-micro-cleanup'
Tiny code clean-up.

* ar/help-micro-cleanup:
  help: convert git_cmd to page in one place
2021-07-16 17:42:51 -07:00
f90efd9981 Merge branch 'ar/submodule-helper-include-cleanup'
Code clean-up.

* ar/submodule-helper-include-cleanup:
  submodule--helper: remove redundant include
2021-07-16 17:42:51 -07:00
cdeabf513a Merge branch 'ab/bundle-updates'
Code clean-up and leak plugging in "git bundle".

* ab/bundle-updates:
  bundle: remove "ref_list" in favor of string-list.c API
  bundle.c: use a temporary variable for OIDs and names
  bundle cmd: stop leaking memory from parse_options_cmd_bundle()
2021-07-16 17:42:49 -07:00
f0ade787ac Merge branch 'hn/refs-iterator-peel-returns-boolean'
Tiny API tweak.

* hn/refs-iterator-peel-returns-boolean:
  refs: make explicit that ref_iterator_peel returns boolean
2021-07-16 17:42:49 -07:00
3cc43bff9c Merge branch 'ab/mktag-tests'
Fill test gaps.

* ab/mktag-tests:
  mktag tests: test fast-export
  mktag tests: test for-each-ref
  mktag tests: test update-ref and reachable fsck
  mktag tests: test hash-object --literally and unreachable fsck
  mktag tests: invert --no-strict test
  mktag tests: parse out options in helper
2021-07-16 17:42:48 -07:00
1fb3445658 Merge branch 'ab/show-branch-tests'
Fill test gaps.

* ab/show-branch-tests:
  show-branch tests: add missing tests
  show-branch: don't <COLOR></RESET> for space characters
  show-branch tests: modernize test code
  show-branch tests: rename the one "show-branch" test file
2021-07-16 17:42:48 -07:00
b2fc822629 Merge branch 'ab/fetch-negotiate-segv-fix'
Code recently added to support common ancestry negotiation during
"git push" did not sanity check its arguments carefully enough.

* ab/fetch-negotiate-segv-fix:
  fetch: fix segfault in --negotiate-only without --negotiation-tip=*
  fetch: document the --negotiate-only option
  send-pack.c: move "no refs in common" abort earlier
2021-07-16 17:42:48 -07:00
368cab75c1 Merge branch 'ab/make-delete-on-error'
Use ".DELETE_ON_ERROR" pseudo target to simplify our Makefile.

* ab/make-delete-on-error:
  Makefile: add and use the ".DELETE_ON_ERROR" flag
2021-07-16 17:42:47 -07:00
a93c6fd677 Merge branch 'ew/mmap-failures'
Error message update.

* ew/mmap-failures:
  xmmap: inform Linux users of tuning knobs on ENOMEM
2021-07-16 17:42:47 -07:00
fba551379e Merge branch 'js/config-mak-windows-pcre-fix'
Whitespace fix.

* js/config-mak-windows-pcre-fix:
  config.mak.uname: PCRE1 cleanup
2021-07-16 17:42:47 -07:00
bc34e5227b Merge branch 'js/gfw-system-config-loc-fix'
Update the location of system-side configuration file on Windows.

* js/gfw-system-config-loc-fix:
  config: normalize the path of the system gitconfig
  cmake(windows): set correct path to the system Git config
  mingw: move Git for Windows' system config where users expect it
2021-07-16 17:42:46 -07:00
508416d95c Merge branch 'ks/submodule-cleanup'
Code cleanup.

* ks/submodule-cleanup:
  submodule: remove unnecessary `prefix` based option logic
2021-07-16 17:42:46 -07:00
3b57e72c0c Merge branch 'tb/midx-use-checksum'
When rebuilding the multi-pack index file reusing an existing one,
we used to blindly trust the existing file and ended up carrying
corrupted data into the updated file, which has been corrected.

* tb/midx-use-checksum:
  midx: report checksum mismatches during 'verify'
  midx: don't reuse corrupt MIDXs when writing
  commit-graph: rewrite to use checksum_valid()
  csum-file: introduce checksum_valid()
2021-07-16 17:42:46 -07:00
d3b88be1b4 Merge branch 'en/merge-dir-rename-corner-case-fix'
The merge code had funny interactions between content based rename
detection and directory rename detection.

* en/merge-dir-rename-corner-case-fix:
  merge-recursive: handle rename-to-self case
  merge-ort: ensure we consult df_conflict and path_conflicts
  t6423: test directory renames causing rename-to-self
2021-07-16 17:42:45 -07:00
fdbcdfcf61 Merge branch 'en/ort-perf-batch-13'
Performance tweaks of "git merge -sort" around lazy fetching of objects.

* en/ort-perf-batch-13:
  merge-ort: add prefetching for content merges
  diffcore-rename: use a different prefetch for basename comparisons
  diffcore-rename: allow different missing_object_cb functions
  t6421: add tests checking for excessive object downloads during merge
  promisor-remote: output trace2 statistics for number of objects fetched
2021-07-16 17:42:45 -07:00
89efac81c7 Merge branch 'en/ort-perf-batch-12'
More fix-ups and optimization to "merge -sort".

* en/ort-perf-batch-12:
  merge-ort: miscellaneous touch-ups
  Fix various issues found in comments
  diffcore-rename: avoid unnecessary strdup'ing in break_idx
  merge-ort: replace string_list_df_name_compare with faster alternative
2021-07-16 17:42:45 -07:00
5b1cd37e44 CodingGuidelines: recommend gender-neutral description
Technical writing seeks to convey information with minimal
friction. One way that a reader can experience friction is if they
encounter a description of "a user" that is later simplified using a
gendered pronoun. If the reader does not consider that pronoun to
apply to them, then they can experience cognitive dissonance that
removes focus from the information.

Give some basic tips to guide us avoid unnecessary uses of gendered
description.

Using a gendered pronoun is appropriate when referring to a specific
person.

There are acceptable existing uses of gendered pronouns within the
Git codebase, such as:

* References to real people (e.g. Linus Torvalds, "the Git maintainer").
  Do not misgender real people. If there is any doubt to the gender of a
  person, then avoid using pronouns.

* References to fictional people with clear genders (e.g. Alice and
  Bob).

* Sample text used in test cases (e.g t3702, t6432).

* The official text of the GPL license contains uses of "he or she",
  but using singular "they" (or modifying the text in some other
  way) is not within the scope of the Git project.

* Literal email messages in Documentation/howto/ should not be edited
  for grammatical concerns such as this, unless we update the entire
  document to fit the standard documentation format. If such an effort is
  taken on, then the authorship would change and no longer refer to the
  exact mail message.

* External projects consumed in contrib/ should not deviate solely for
  style reasons. Recommended edits should be contributed to those
  projects directly.

Other cases within the Git project were cleaned up by the previous
changes.

Co-authored-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-16 11:35:46 -07:00
ca2d62b787 parse-options: don't complete option aliases by default
Since 'OPT_ALIAS' was created in 5c387428f1 (parse-options: don't emit
"ambiguous option" for aliases, 2019-04-29), 'git clone
--git-completion-helper', which is used by the Bash completion script to
list options accepted by clone (via '__gitcomp_builtin'), lists both
'--recurse-submodules' and its alias '--recursive', which was not the
case before since '--recursive' had the PARSE_OPT_HIDDEN flag set, and
options with this flag are skipped by 'parse-options.c::show_gitcomp',
which implements 'git <cmd> --git-completion-helper'.

This means that typing 'git clone --recurs<TAB>' will yield both
'--recurse-submodules' and '--recursive', which is not ideal since both
do the same thing, and so the completion should directly complete the
canonical option.

At the point where 'show_gitcomp' is called in 'parse_options_step',
'preprocess_options' was already called in 'parse_options', so any
aliases are now copies of the original options with a modified help text
indicating they are aliases.

Helpfully, since 64cc539fd2 (parse-options: don't leak alias help
messages, 2021-03-21) these copies have the PARSE_OPT_FROM_ALIAS flag
set, so check that flag early in 'show_gitcomp' and do not print them,
unless the user explicitely requested that *all* completion be shown (by
setting 'GIT_COMPLETION_SHOW_ALL'). After all, if we want to encourage
the use of '--recurse-submodules' over '--recursive', we'd better just
suggest the former.

The only other options alias is 'log' and friends' '--mailmap', which is
an alias for '--use-mailmap', but the Bash completion helpers for these
commands do not use '__gitcomp_builtin', and thus are unnaffected by
this change.

Test the new behaviour in t9902-completion.sh. As a side effect, this
also tests the correct behaviour of GIT_COMPLETION_SHOW_ALL, which was
not tested before. Note that since '__gitcomp_builtin' caches the
options it shows, we need to re-source the completion script to clear
that cache for the second test.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-16 11:31:44 -07:00
94b82d5686 rename: bump limit defaults yet again
These were last bumped in commit 92c57e5c1d (bump rename limit
defaults (again), 2011-02-19), and were bumped both because processors
had gotten faster, and because people were getting ugly merges that
caused problems and reporting it to the mailing list (suggesting that
folks were willing to spend more time waiting).

Since that time:
  * Linus has continued recommending kernel folks to set
    diff.renameLimit=0 (maps to 32767, currently)
  * Folks with repositories with lots of renames were happy to set
    merge.renameLimit above 32767, once the code supported that, to
    get correct cherry-picks
  * Processors have gotten faster
  * It has been discovered that the timing methodology used last time
    probably used too large example files.

The last point is probably worth explaining a bit more:

  * The "average" file size used appears to have been average blob size
    in the linux kernel history at the time (probably v2.6.25 or
    something close to it).
  * Since bigger files are modified more frequently, such a computation
    weights towards larger files.
  * Larger files may be more likely to be modified over time, but are
    not more likely to be renamed -- the mean and median blob size
    within a tree are a bit higher than the mean and median of blob
    sizes in the history leading up to that version for the linux
    kernel.
  * The mean blob size in v2.6.25 was half the average blob size in
    history leading to that point
  * The median blob size in v2.6.25 was about 40% of the mean blob size
    in v2.6.25.
  * Since the mean blob size is more than double the median blob size,
    any file as big as the mean will not be compared to any files of
    median size or less (because they'd be more than 50% dissimilar).
  * Since it is the number of files compared that provides the O(n^2)
    behavior, median-sized files should matter more than mean-sized
    ones.

The combined effect of the above is that the file size used in past
calculations was likely about 5x too large.  Combine that with a CPU
performance improvement of ~30%, and we can increase the limits by
a factor of sqrt(5/(1-.3)) = 2.67, while keeping the original stated
time limits.

Keeping the same approximate time limit probably makes sense for
diff.renameLimit (there is no progress feedback in e.g. git log -p),
but the experience above suggests merge.renameLimit could be extended
significantly.  In fact, it probably would make sense to have an
unlimited default setting for merge.renameLimit, but that would
likely need to be coupled with changes to how progress is displayed.
(See https://lore.kernel.org/git/YOx+Ok%2FEYvLqRMzJ@coredump.intra.peff.net/
for details in that area.)  For now, let's just bump the approximate
time limit from 10s to 1m.

(Note: We do not want to use actual time limits, because getting results
that depend on how loaded your system is that day feels bad, and because
we don't discover that we won't get all the renames until after we've
put in a lot of work rather than just upfront telling the user there are
too many files involved.)

Using the original time limit of 2s for diff.renameLimit, and bumping
merge.renameLimit from 10s to 60s, I found the following timings using
the simple script at the end of this commit message (on an AWS c5.xlarge
which reports as "Intel(R) Xeon(R) Platinum 8124M CPU @ 3.00GHz"):

      N   Timing
   1300    1.995s
   7100   59.973s

So let's round down to nice even numbers and bump the limits from
400->1000, and from 1000->7000.

Here is the measure_rename_perf script (adapted from
https://lore.kernel.org/git/20080211113516.GB6344@coredump.intra.peff.net/
in particular to avoid triggering the linear handling from
basename-guided rename detection):

    #!/bin/bash

    n=$1; shift

    rm -rf repo
    mkdir repo && cd repo
    git init -q -b main

    mkdata() {
      mkdir $1
      for i in `seq 1 $2`; do
        (sed "s/^/$i /" <../sample
         echo tag: $1
        ) >$1/$i
      done
    }

    mkdata initial $n
    git add .
    git commit -q -m initial

    mkdata new $n
    git add .
    cd new
    for i in *; do git mv $i $i.renamed; done
    cd ..
    git rm -q -rf initial
    git commit -q -m new

    time git diff-tree -M -l0 --summary HEAD^ HEAD

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-15 16:54:34 -07:00
9dd29dbef0 diffcore-rename: treat a rename_limit of 0 as unlimited
In commit 89973554b5 (diffcore-rename: make diff-tree -l0 mean
-l<large>, 2017-11-29), -l0 was given a special magical "large" value,
but one which was not large enough for some uses (as can be seen from
commit 9f7e4bfa3b (diff: remove silent clamp of renameLimit,
2017-11-13).  Make 0 (or a negative value) be treated as unlimited
instead and update the documentation to mention this.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-15 16:54:24 -07:00
6623a528e0 doc: clarify documentation for rename/copy limits
A few places in the docs implied that rename/copy detection is always
quadratic or that all (unpaired) files were involved in the quadratic
portion of rename/copy detection.  The following two commits each
introduced an exception to this:

    9027f53cb5 (Do linear-time/space rename logic for exact renames,
                  2007-10-25)
    bd24aa2f97 (diffcore-rename: guide inexact rename detection based
                  on basenames, 2021-02-14)

(As a side note, for copy detection, the basename guided inexact rename
detection is turned off and the exact renames will only result in
sources (without the dests) being removed from the set of files used in
quadratic detection.  So, for copy detection, the documentation was
closer to correct.)

Avoid implying that all files involved in rename/copy detection are
subject to the full quadratic algorithm.  While at it, also note the
default values for all these settings.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-15 16:54:24 -07:00
05d2c61c67 diff: correct warning message when renameLimit exceeded
The warning when quadratic rename detection was skipped referred to
"inexact rename detection".  For years, the only linear portion of
rename detection was looking for exact renames, so "inexact rename
detection" was an accurate way to refer to the quadratic portion of
rename detection.  However, that changed with commit bd24aa2f97
(diffcore-rename: guide inexact rename detection based on basenames,
2021-02-14).  Let's instead use the term "exhaustive rename detection"
to refer to the quadratic portion.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-15 16:54:24 -07:00
0db4961c49 worktree: teach add to accept --reason <string> with --lock
The default reason stored in the lock file, "added with --lock",
is unlikely to be what the user would have given in a separate
`git worktree lock` command. Allowing `--reason` to be specified
along with `--lock` when adding a working tree gives the user control
over the reason for locking without needing a second command.

Signed-off-by: Stephen Manz <smanz@alum.mit.edu>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-15 13:30:59 -07:00
82823118b9 fetch: die on invalid --negotiation-tip hash
If a full hexadecimal hash is given as a --negotiation-tip to "git
fetch", and that hash does not correspond to an object, "git fetch" will
segfault if --negotiate-only is given and will silently ignore that hash
otherwise. Make these cases fatal errors, just like the case when an
invalid ref name or abbreviated hash is given.

While at it, mark the error messages as translatable.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-15 11:58:52 -07:00
54a03bc7d9 send-pack: fix push nego. when remote has refs
Commit 477673d6f3 ("send-pack: support push negotiation", 2021-05-05)
did not test the case in which a remote advertises at least one ref. In
such a case, "remote_refs" in get_commons_through_negotiation() in
send-pack.c would also contain those refs with a zero ref->new_oid (in
addition to the refs being pushed with a nonzero ref->new_oid). Passing
them as negotiation tips to "git fetch" causes an error, so filter them
out.

(The exact error that would happen in "git fetch" in this case is a
segmentation fault, which is unwanted. This will be fixed in the
subsequent commit.)

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-15 11:58:52 -07:00
74fab8ff54 send-pack: fix push.negotiate with remote helper
Commit 477673d6f3 ("send-pack: support push negotiation", 2021-05-05)
introduced the push.negotiate config variable and included a test. The
test only covered pushing without a remote helper, so the fact that
pushing with a remote helper doesn't work went unnoticed.

This is ultimately caused by the "url" field not being set in the args
struct. This field being unset probably went unnoticed because besides
push negotiation, this field is only used to generate a "pushee" line in
a push cert (and if not given, no such line is generated). Therefore,
set this field.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-15 11:58:52 -07:00
a066a90db6 ci(check-whitespace): restrict to the intended commits
During a run of the `check-whitespace` we want to verify that the
commits introduced in the Pull Request have no whitespace issues. We
only want to look at those commits, not the upstream commits (because
the contributor cannot do anything about the latter).

However, by using the `-<count>` form in `git log --check`, we run the
risk of looking at the wrong commits. The reason is that the
`actions/checkout` step does _not_ check out the tip commit of the Pull
Request's branch: Instead, it checks out a merge commit that merges that
branch into the target branch. For that reason, we already adjust the
commit count by incrementing it, but that is not enough: if the upstream
branch has newer commits, they are traversed _first_. And obviously we
will then miss some of the commits that we _actually_ wanted to look at.

Therefore, let's be careful to stop assuming a linear, up to date commit
topology in the contributed commits, and instead specify the correct
commit range.

Unfortunately, this means that we no longer can rely on a shallow clone:
There is no way of knowing just how many commits the upstream branch
advanced after the commit from which the PR branch branched off. So
let's just go with a full clone instead, and be safe rather than sorry
(if we have "too shallow" a situation, a commit range `@{u}..` may very
well include a shallow commit itself, and the output of `git show
--check <shallow>` is _not_ pretty).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-14 15:38:01 -07:00
cc00362125 ci(check-whitespace): stop requiring a read/write token
As part of some recent security tightening, GitHub introduced the
ability to configure GitHub workflows to be run with a read-only token.
This is much more secure, in particular when working in a public
repository: While the regular read/write token might be restricted to
writing to the current branch, it is not necessarily restricted to
access only the current Pull Request.

However, the `check-whitespace` workflow threw a wrench into this plan:
it _requires_ write access (because it wants to add a PR comment in case
of a whitespace issue).

Let's just skip that PR comment. The user can always click through to
the actual error, even if it is slightly less convenient.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-14 15:37:59 -07:00
1ba5f45132 checkout: stop expanding sparse indexes
Previous changes did the necessary improvements to unpack-trees.c and
diff-lib.c in order to modify a sparse index based on its comparision
with a tree. The only remaining work is to remove some
ensure_full_index() calls and add tests that verify that the index is
not expanded in our interesting cases. Include 'switch' and 'restore' in
these tests, as they share a base implementation with 'checkout'.

Here are the relevant performance results from
p2000-sparse-operations.sh:

Test                                     HEAD~1           HEAD
--------------------------------------------------------------------------------
2000.18: git checkout -f - (full-v3)     0.49(0.43+0.03)  0.47(0.39+0.05) -4.1%
2000.19: git checkout -f - (full-v4)     0.45(0.37+0.06)  0.42(0.37+0.05) -6.7%
2000.20: git checkout -f - (sparse-v3)   0.76(0.71+0.07)  0.04(0.03+0.04) -94.7%
2000.21: git checkout -f - (sparse-v4)   0.75(0.72+0.04)  0.05(0.06+0.04) -93.3%

It is important to compare the full index case to the sparse index case,
as the previous results for the sparse index were inflated by the index
expansion. For index v4, this is an 88% improvement.

On an internal repository with over two million paths at HEAD and a
sparse-checkout definition containing ~60,000 of those paths, 'git
checkout' went from 3.5s to 297ms with this change. The theoretical
optimum where only those ~60,000 paths exist was 275ms, so the extra
sparse directory entries contribute a 22ms overhead.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-14 15:05:53 -07:00
f934f1b47f sparse-index: recompute cache-tree
When some commands run with command_requires_full_index=1, then the
index can get in a state where the in-memory cache tree is actually
equal to the sparse index's cache tree instead of the full one.

This results in incorrect entry_count values. By clearing the cache
tree before converting to sparse, we avoid this issue.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-14 15:05:53 -07:00
daa1acefc5 commit: integrate with sparse-index
Update 'git commit' to allow using the sparse-index in memory without
expanding to a full one. The only place that had an ensure_full_index()
call was in cache_tree_update(). The recursive algorithm for
update_one() was already updated in 2de37c536 (cache-tree: integrate
with sparse directory entries, 2021-03-03) to handle sparse directory
entries in the index.

Most of this change involves testing different command-line options that
allow specifying which on-disk changes should be included in the commit.
This includes no options (only take currently-staged changes), -a (take
all tracked changes), and --include (take a list of specific changes).
To simplify testing that these options do not expand the index, update
the test that previously verified that 'git status' does not expand the
index with a helper method, ensure_not_expanded().

This allows 'git commit' to operate much faster when the sparse-checkout
cone is much smaller than the full list of files at HEAD.

Here are the relevant lines from p2000-sparse-operations.sh:

Test                                      HEAD~1           HEAD
----------------------------------------------------------------------------------
2000.14: git commit -a -m A (full-v3)     0.35(0.26+0.06)  0.36(0.28+0.07) +2.9%
2000.15: git commit -a -m A (full-v4)     0.32(0.26+0.05)  0.34(0.28+0.06) +6.3%
2000.16: git commit -a -m A (sparse-v3)   0.63(0.59+0.06)  0.04(0.05+0.05) -93.7%
2000.17: git commit -a -m A (sparse-v4)   0.64(0.59+0.08)  0.04(0.04+0.04) -93.8%

It is important to compare the full-index case to the sparse-index case,
so the improvement for index version v4 is actually an 88% improvement in
this synthetic example.

In a real repository with over two million files at HEAD and 60,000
files in the sparse-checkout definition, the time for 'git commit -a'
went from 2.61 seconds to 134ms. I compared this to the result if the
index only contained the paths in the sparse-checkout definition and
found the theoretical optimum to be 120ms, so the out-of-cone paths only
add a 12% overhead.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-14 15:05:53 -07:00
11042ab914 p2000: compress repo names
By using shorter names for the test repos, we will get a slightly more
compressed performance summary without comprimising clarity.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-14 15:05:53 -07:00
0d53d19946 p2000: add 'git checkout -' test and decrease depth
As we increase our list of commands to test in
p2000-sparse-operations.sh, we will want to have a slightly smaller test
repository. Reduce the size by a factor of four by reducing the depth of
the step that creates a big index around a moderately-sized repository.

Also add a step to run 'git checkout -' on repeat. This requires having
a previous location in the reflog, so add that to the initialization
steps.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-14 15:05:53 -07:00
e5ca291076 t1092: document bad sparse-checkout behavior
There are several situations where a repository with sparse-checkout
enabled will act differently than a normal repository, and in ways that
are not intentional. The test t1092-sparse-checkout-compatibility.sh
documents some of these deviations, but a casual reader might think
these are intentional behavior changes.

Add comments on these tests that make it clear that these behaviors
should be updated. Using 'NEEDSWORK' helps contributors find that these
are potential areas for improvement.

Helped-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-14 13:42:49 -07:00
f8fe49e539 fsmonitor: integrate with sparse index
If we need to expand a sparse-index into a full one, then the FS Monitor
bitmap is going to be incorrect. Ensure that we start fresh at such an
event.

While this is currently a performance drawback, the eventual hope of the
sparse-index feature is that these expansions will be rare and hence we
will be able to keep the FS Monitor data accurate across multiple Git
commands.

These tests are added to demonstrate that the behavior is the same
across a full index and a sparse index, but also that file modifications
to a tracked directory outside of the sparse cone will trigger
ensure_full_index().

Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-14 13:42:49 -07:00
fe0d576153 wt-status: expand added sparse directory entries
It is difficult, but possible, to get into a state where we intend to
add a directory that is outside of the sparse-checkout definition. Add a
test to t1092-sparse-checkout-compatibility.sh that demonstrates this
using a combination of 'git reset --mixed' and 'git checkout --orphan'.

This test failed before because the output of 'git status
--porcelain=v2' would not match on the lines for folder1/:

* The sparse-checkout repo (with a full index) would output each path
  name that is intended to be added.

* The sparse-index repo would only output that "folder1/" is staged for
  addition.

The status should report the full list of files to be added, and so this
sparse-directory entry should be expanded to a full list when reaching
it inside the wt_status_collect_changes_initial() method. Use
read_tree_at() to assist.

Somehow, this loop over the cache entries was not guarded by
ensure_full_index() as intended.

Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-14 13:42:49 -07:00
d76723ee53 status: use sparse-index throughout
By testing 'git -c core.fsmonitor= status -uno', we can check for the
simplest index operations that can be made sparse-aware. The necessary
implementation details are already integrated with sparse-checkout, so
modify command_requires_full_index to be zero for cmd_status().

In refresh_index(), we loop through the index entries to refresh their
stat() information. However, sparse directories have no stat()
information to populate. Ignore these entries.

This allows 'git status' to no longer expand a sparse index to a full
one. This is further tested by dropping the "-uno" option and adding an
untracked file into the worktree.

The performance test p2000-sparse-checkout-operations.sh demonstrates
these improvements:

Test                                  HEAD~1           HEAD
-----------------------------------------------------------------------------
2000.2: git status (full-index-v3)    0.31(0.30+0.05)  0.31(0.29+0.06) +0.0%
2000.3: git status (full-index-v4)    0.31(0.29+0.07)  0.34(0.30+0.08) +9.7%
2000.4: git status (sparse-index-v3)  2.35(2.28+0.10)  0.04(0.04+0.05) -98.3%
2000.5: git status (sparse-index-v4)  2.35(2.24+0.15)  0.05(0.04+0.06) -97.9%

Note that since HEAD~1 was expanding the sparse index by parsing trees,
it was artificially slower than the full index case. Thus, the 98%
improvement is misleading, and instead we should celebrate the 0.34s to
0.05s improvement of 85%. This is more indicative of the peformance
gains we are expecting by using a sparse index.

Note: we are dropping the assignment of core.fsmonitor here. This is not
necessary for the test script as we are not altering the config any
other way. Correct integration with FS Monitor will be validated in
later changes.

Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-14 13:42:49 -07:00
bf48e5acdb status: skip sparse-checkout percentage with sparse-index
'git status' began reporting a percentage of populated paths when
sparse-checkout is enabled in 051df3cf (wt-status: show sparse
checkout status as well, 2020-07-18). This percentage is incorrect when
the index has sparse directories. It would also be expensive to
calculate as we would need to parse trees to count the total number of
possible paths.

Avoid the expensive computation by simplifying the output to only report
that a sparse checkout exists, without the percentage.

This change is the reason we use 'git status --porcelain=v2' in
t1092-sparse-checkout-compatibility.sh. We don't want to ensure that
this message is equal across both modes, but instead just the important
information about staged, modified, and untracked files are compared.

Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-14 13:42:49 -07:00
9eb00af562 diff-lib: handle index diffs with sparse dirs
While comparing an index to a tree, we may see a sparse directory entry.
In this case, we should compare that portion of the tree to the tree
represented by that entry. This could include a new tree which needs to
be expanded to a full list of added files. It could also include an
existing tree, in which case all of the changes inside are important to
describe, including the modifications, additions, and deletions. Note
that the case where the tree has a path and the index does not remains
identical to before: the lack of a cache entry is the same with a sparse
index.

Use diff_tree_oid() appropriately to compute the diff.

Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-14 13:42:49 -07:00
69bdbdb0ee dir.c: accept a directory as part of cone-mode patterns
When we have sparse directory entries in the index, we want to compare
that directory against sparse-checkout patterns. Those pattern matching
algorithms are built expecting a file path, not a directory path. This
is especially important in the "cone mode" patterns which will match
files that exist within the "parent directories" as well as the
recursive directory matches.

If path_matches_pattern_list() is given a directory, we can add a fake
filename ("-") to the directory and get the same results as before,
assuming we are in cone mode. Since sparse index requires cone mode
patterns, this is an acceptable assumption.

Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-14 13:42:49 -07:00
523506df51 unpack-trees: unpack sparse directory entries
During unpack_callback(), index entries are compared against tree
entries. These are matched according to names and types. One goal is to
decide if we should recurse into subtrees or simply operate on one index
entry.

In the case of a sparse-directory entry, we do not want to recurse into
that subtree and instead simply compare the trees. In some cases, we
might want to perform a merge operation on the entry, such as during
'git checkout <commit>' which wants to replace a sparse tree entry with
the tree for that path at the target commit. We extend the logic within
unpack_single_entry() to create a sparse-directory entry in this case,
and then that is sent to call_unpack_fn().

There are some subtleties in this process. For instance, we need to
update find_cache_entry() to allow finding a sparse-directory entry that
exactly matches a given path. Use the new helper method
sparse_dir_matches_path() for this. We also need to ignore conflict
markers in the case that the entries correspond to directories and we
already have a sparse directory entry.

Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-14 13:42:49 -07:00
bd6a3fd7f1 unpack-trees: rename unpack_nondirectories()
In the next change, we will use this method to unpack a sparse directory
entry, so change the name to unpack_single_entry() so these entries
apply. The new name reflects that we will not recurse into trees in
order to resolve the conflicts.

Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-14 13:42:49 -07:00
cd807a5cda unpack-trees: compare sparse directories correctly
As we further integrate the sparse-index into unpack-trees, we need to
ensure that we compare sparse directory entries correctly with other
entries. This affects searching for an exact path as well as sorting
index entries.

Sparse directory entries contain the trailing directory separator. This
is important for the sorting, in particular. Thus, within
do_compare_entry() we stop using S_IFREG in all cases, since sparse
directories should use S_IFDIR to indicate that the comparison should
treat the entry name as a dirctory.

Within compare_entry(), it first calls do_compare_entry() to check the
leading portion of the name. When the input path is a directory name, we
could match exactly already. Thus, we should return 0 if we have an
exact string match on a sparse directory entry. The final check is a
length comparison between the strings.

Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-14 13:42:48 -07:00
17a1bb570b unpack-trees: preserve cache_bottom
The cache_bottom member of 'struct unpack_trees_options' is used to
track the range of index entries corresponding to a node of the cache
tree. While recursing with traverse_by_cache_tree(), this value is
preserved on the call stack using a local and then restored as that
method returns.

The mark_ce_used() method normally modifies the cache_bottom member when
it refers to the marked cache entry. However, sparse directory entries
are stored as nodes in the cache-tree data structure as of 2de37c53
(cache-tree: integrate with sparse directory entries, 2021-03-30). Thus,
the cache_bottom will be modified as the cache-tree walk advances. Do
not update it as well within mark_ce_used().

Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-14 13:42:48 -07:00
bf26c06f12 t1092: add tests for status/add and sparse files
Before moving to update 'git status' and 'git add' to work with sparse
indexes, add an explicit test that ensures the sparse-index works the
same as a normal sparse-checkout when the worktree contains directories
and files outside of the sparse cone.

Specifically, 'folder1/a' is a file in our test repo, but 'folder1' is
not in the sparse cone. When 'folder1/a' is modified, the file is not
shown as modified and adding it will fail. This is new behavior as of
a20f704 (add: warn when asked to update SKIP_WORKTREE entries,
2021-04-08). Before that change, these adds would be silently ignored.

Untracked files are fine: adding new files both with 'git add .' and
'git add folder1/' works just as in a full checkout. This may not be
entirely desirable, but we are not intending to change behavior at the
moment, only document it. A future change could alter the behavior to
be more sensible, and this test could be modified to satisfy the new
expected behavior.

Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-14 13:42:48 -07:00
e669ffb2b8 t1092: expand repository data shape
As more features integrate with the sparse-index feature, more and more
special cases arise that require different data shapes within the tree
structure of the repository in order to demonstrate those cases.

Add several interesting special cases all at once instead of sprinkling
them across several commits. The interesting cases being added here are:

* Add sparse-directory entries on both sides of directories within the
  sparse-checkout definition.

* Add directories outside the sparse-checkout definition who have only
  one entry and are the first entry of a directory with multiple
  entries.

* Add filenames adjacent to a sparse directory entry that sort before
  and after the trailing slash.

Later tests will take advantage of these shapes, but they also deepen
the tests that already exist.

Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-14 13:42:48 -07:00
3d814b5dc0 t1092: replace incorrect 'echo' with 'cat'
This fixes the test data shape to be as expected, allowing rename
detection to work properly now that the 'larger-content' file actually
has meaningful lines.

Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-14 13:42:21 -07:00
47410778fb sparse-index: include EXTENDED flag when expanding
When creating a full index from a sparse one, we create cache entries
for every blob within a given sparse directory entry. These are
correctly marked with the CE_SKIP_WORKTREE flag, but the CE_EXTENDED
flag is not included. The CE_EXTENDED flag would exist if we loaded a
full index from disk with these entries marked with CE_SKIP_WORKTREE, so
we can add the flag here to be consistent. This allows us to directly
compare the flags present in cache entries when testing the sparse-index
feature, but has no significance to its correctness in the user-facing
functionality.

Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-14 13:42:21 -07:00
fc6609d198 sparse-index: skip indexes with unmerged entries
The sparse-index format is designed to be compatible with merge
conflicts, even those outside the sparse-checkout definition. The reason
is that when converting a full index to a sparse one, a cache entry with
nonzero stage will not be collapsed into a sparse directory entry.

However, this behavior was not tested, and a different behavior within
convert_to_sparse() fails in this scenario. Specifically,
cache_tree_update() will fail when unmerged entries exist.
convert_to_sparse_rec() uses the cache-tree data to recursively walk the
tree structure, but also to compute the OIDs used in the
sparse-directory entries.

Add an index scan to convert_to_sparse() that will detect if these merge
conflict entries exist and skip the conversion before trying to update
the cache-tree. This is marked as NEEDSWORK because this can be removed
with a suitable update to cache_tree_update() or a similar method that
can construct a cache-tree with invalid nodes, but still allow creating
the nodes necessary for creating sparse directory entries.

It is possible that in the future we will not need to make such an
update, since if we do not expand a sparse-index into a full one, this
conversion does not need to happen. Thus, this can be deferred until the
merge machinery is made to integrate with the sparse-index.

Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-14 13:42:21 -07:00
e61059660c ci: run make sparse as part of the GitHub workflow
Occasionally we receive reviews after patches were integrated, where
`sparse` (https://sparse.docs.kernel.org/en/latest/ has more information
on that project) identified problems such as file-local variables or
functions being declared as global.

By running `sparse` as part of our Continuous Integration, we can catch
such things much earlier. Even better: developers who activated GitHub
Actions on their forks can catch such issues before even sending their
patches to the Git mailing list.

This addresses https://github.com/gitgitgadget/git/issues/345

Note: Not even Ubuntu 20.04 ships with a new enough version of `sparse`
to accommodate Git's needs. The symptom looks like this:

    add-interactive.c:537:51: error: Using plain integer as NULL pointer

To counter that, we download and install the custom-built `sparse`
package from the Azure Pipeline that we specifically created to address
this issue.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-14 10:14:21 -07:00
d1ed8d6cee load_ref_decorations(): fix decoration with tags
Commit 88473c8bae ("load_ref_decorations(): avoid parsing non-tag
objects", 2021-06-22) introduced a shortcut to `add_ref_decoration()`:
Rather than calling `parse_object()`, we go for `oid_object_info()` and
then `lookup_object_by_type()` using the type just discovered. As
detailed in the commit message, this provides a significant time saving.

Unfortunately, it also changes the behavior: We lose all annotated tags
from the decoration.

The reason this happens is in the loop where we try to peel the tags, we
won't necessarily have parsed that first object. If we haven't, its
`tagged` field will be NULL, so we won't actually add a decoration for
the pointed-to object.

Make sure to parse the tag object at the top of the peeling loop. This
effectively restores the pre-88473c8bae parsing -- but only of tags,
allowing us to keep most of the possible speedup from 88473c8bae.

On my big ~220k ref test case (where it's mostly non-tags), the
timings [using "git log -1 --decorate"] are:

  - before either patch: 2.945s
  - with my broken patch: 0.707s
  - with [this patch]: 0.788s

The simplest way to do this is to just conditionally parse before the
loop:

  if (obj->type == OBJ_TAG)
          parse_object(&obj->oid);

But we can observe that our tag-peeling loop needs to peel already, to
examine recursive tags-of-tags. So instead of introducing a new call to
parse_object(), we can simply move the parsing higher in the loop:
instead of parsing the new object before we loop, parse each tag object
before we look at its "tagged" field.

This has another beneficial side effect: if a tag points at a commit (or
other non-tag type), we do not bother to parse the commit at all now.
And we know it is a commit without calling oid_object_info(), because
parsing the surrounding tag object will have created the correct in-core
object based on the "type" field of the tag.

Our test coverage for --decorate was obviously not good, since we missed
this quite-basic regression. The new tests covers an annotated tag
(showing the fix), but also that we correctly show annotations for
lightweight tags and double-annotated tag-of-tags.

Reported-by: Martin Ågren <martin.agren@gmail.com>
Helped-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-14 10:11:02 -07:00
f7c35ea2a1 worktree: mark lock strings with _() for translation
- default lock string, "added with --lock"
- temporary lock string, "initializing"

Signed-off-by: Stephen Manz <smanz@alum.mit.edu>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-14 09:29:59 -07:00
f9365c0a24 t2400: clean up '"add" worktree with lock' test
- remove unneeded `git rev-parse` which must have come from a copy-paste
  of another test
- unlock the worktree with test_when_finished

Signed-off-by: Stephen Manz <smanz@alum.mit.edu>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-14 09:29:36 -07:00
75ae10bc75 The fourth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-13 16:52:53 -07:00
1157618a2a Merge branch 'rs/grep-parser-fix'
"git grep --and -e foo" ought to have been diagnosed as an error
but instead segfaulted, which has been corrected.

* rs/grep-parser-fix:
  grep: report missing left operand of --and
2021-07-13 16:52:53 -07:00
21ef7ee4d6 Merge branch 'bk/doc-commit-typofix'
Doc typo/grammo-fix.

* bk/doc-commit-typofix:
  Documentation: fix typo in the --patch option of the commit command
2021-07-13 16:52:52 -07:00
b6bd704c3e Merge branch 'dc/p4-binary-submit-fix'
Prevent "git p4" from failing to submit changes to binary file.

* dc/p4-binary-submit-fix:
  git-p4: fix failed submit by skip non-text data files
2021-07-13 16:52:52 -07:00
d2992c6bc0 Merge branch 'ab/pre-auto-gc-hook-test'
Test fix.

* ab/pre-auto-gc-hook-test:
  gc tests: add a test for the "pre-auto-gc" hook
2021-07-13 16:52:52 -07:00
308528a3ea Merge branch 'jk/union-merge-binary'
The "union" conflict resolution variant misbehaved when used with
binary merge driver.

* jk/union-merge-binary:
  ll_union_merge(): rename path_unused parameter
  ll_union_merge(): pass name labels to ll_xdl_merge()
  ll_binary_merge(): handle XDL_MERGE_FAVOR_UNION
2021-07-13 16:52:51 -07:00
c3c0b71f9a Merge branch 'mr/cmake'
CMake update.

* mr/cmake:
  cmake: add warning for ignored MSGFMT_EXE
  cmake: create compile_commands.json by default
  cmake: add knob to disable vcpkg
2021-07-13 16:52:51 -07:00
4d4c7d05da Merge branch 'ab/describe-tests-fix'
Various updates to tests around "git describe"

* ab/describe-tests-fix:
  describe tests: support -C in "check_describe"
  describe tests: fix nested "test_expect_success" call
  describe tests: don't rely on err.actual from "check_describe"
  describe tests: refactor away from glob matching
  describe tests: improve test for --work-tree & --dirty
2021-07-13 16:52:51 -07:00
4da281e84d Merge branch 'ab/pickaxe-pcre2'
Rewrite the backend for "diff -G/-S" to use pcre2 engine when
available.

* ab/pickaxe-pcre2: (22 commits)
  xdiff-interface: replace discard_hunk_line() with a flag
  xdiff users: use designated initializers for out_line
  pickaxe -G: don't special-case create/delete
  pickaxe -G: terminate early on matching lines
  xdiff-interface: allow early return from xdiff_emit_line_fn
  xdiff-interface: prepare for allowing early return
  pickaxe -S: slightly optimize contains()
  pickaxe: rename variables in has_changes() for brevity
  pickaxe -S: support content with NULs under --pickaxe-regex
  pickaxe: assert that we must have a needle under -G or -S
  pickaxe: refactor function selection in diffcore-pickaxe()
  perf: add performance test for pickaxe
  pickaxe/style: consolidate declarations and assignments
  diff.h: move pickaxe fields together again
  pickaxe: die when --find-object and --pickaxe-all are combined
  pickaxe: die when -G and --pickaxe-regex are combined
  pickaxe tests: add missing test for --no-pickaxe-regex being an error
  pickaxe tests: test for -G, -S and --find-object incompatibility
  pickaxe tests: add test for "log -S" not being a regex
  pickaxe tests: add test for diffgrep_consume() internals
  ...
2021-07-13 16:52:50 -07:00
c9780bb2ca Merge branch 'hn/prep-tests-for-reftable'
Preliminary clean-up of tests before the main reftable changes
hits the codebase.

* hn/prep-tests-for-reftable: (22 commits)
  t1415: set REFFILES for test specific to storage format
  t4202: mark bogus head hash test with REFFILES
  t7003: check reflog existence only for REFFILES
  t7900: stop checking for loose refs
  t1404: mark tests that muck with .git directly as REFFILES.
  t2017: mark --orphan/logAllRefUpdates=false test as REFFILES
  t1414: mark corruption test with REFFILES
  t1407: require REFFILES for for_each_reflog test
  test-lib: provide test prereq REFFILES
  t5304: use "reflog expire --all" to clear the reflog
  t5304: restyle: trim empty lines, drop ':' before >
  t7003: use rev-parse rather than FS inspection
  t5000: inspect HEAD using git-rev-parse
  t5000: reformat indentation to the latest fashion
  t1301: fix typo in error message
  t1413: use tar to save and restore entire .git directory
  t1401-symbolic-ref: avoid direct filesystem access
  t1401: use tar to snapshot and restore repo state
  t5601: read HEAD using rev-parse
  t9300: check ref existence using test-helper rather than a file system check
  ...
2021-07-13 16:52:50 -07:00
0659866a09 Merge branch 'fc/push-simple-updates-cleanup'
Some more code and doc clarification around "git push".

* fc/push-simple-updates-cleanup:
  push: don't get a full remote object
  push: only check same_remote when needed
  push: remove trivial function
  push: remove redundant check
  push: factor out the typical case
  push: get rid of all the setup_push_* functions
  push: trivial simplifications
  push: make setup_push_* return the dst
  push: only get the branch when needed
  push: factor out null branch check
  push: split switch cases
  push: return immediately in trivial switch case
  push: create new get_upstream_ref() helper
2021-07-13 16:52:50 -07:00
07e230d762 Merge branch 'fc/push-simple-updates'
Some code and doc clarification around "git push".

* fc/push-simple-updates:
  doc: push: explain default=simple correctly
  push: remove unused code in setup_push_upstream()
  push: simplify setup_push_simple()
  push: reorganize setup_push_simple()
  push: copy code to setup_push_simple()
  push: hedge code of default=simple
  push: rename !triangular to same_remote
2021-07-13 16:52:49 -07:00
5d96bcbc06 Merge branch 'zh/cat-file-batch-fix'
"git cat-file --batch-all-objects"" misbehaved when "--batch" is in
use and did not ask for certain object traits.

* zh/cat-file-batch-fix:
  cat-file: merge two block into one
  cat-file: handle trivial --batch format with --batch-all-objects
2021-07-13 16:52:49 -07:00
b1d87fbaf1 doc/rev-list-options: fix duplicate word typo
Reported-by: Jason Hatton <jhatton@globalfinishing.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-13 16:19:46 -07:00
927dc33070 advice.h: add missing __attribute__((format)) & fix usage
Add the missing __attribute__((format)) checking to
advise_if_enabled(). This revealed a trivial issue introduced in
b3b18d1621 (advice: revamp advise API, 2020-03-02). We treated the
argv[1] as a format string, but did not intend to do so. Let's use
"%s" and pass argv[1] as an argument instead.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-13 15:20:20 -07:00
75d31ceec5 *.h: add a few missing __attribute__((format))
Add missing format attributes to API functions that take printf
arguments.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-13 15:20:20 -07:00
48ca53cac4 *.c static functions: add missing __attribute__((format))
Add missing __attribute__((format)) function attributes to various
"static" functions that take printf arguments.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-13 15:20:20 -07:00
eb448631fb git-diff: fix missing --merge-base docs
When `git diff --merge-base` was introduced at around Git 2.30, the
documentation included a few errors.

In the example given for `git diff --cached --merge-base`, the
`--cached` flag was omitted for the `--merge-base` example. Add the
missing flag.

In the `git diff <commit>` case, we failed to mention that
`--merge-base` is an available option. Give the usage of `--merge-base`
as an option there.

Finally, there are two errors in the usage of `git diff`. Firstly, we do
not mention `--merge-base` in the `git diff --cached` case. Mention it
so that it's consistent with the documentation. Secondly, we put the
`[--merge-base]` in between `<commit>` and `[<commit>...]`. Move the
`[--merge-base]` so that it's beside `[<options>]` which is a more
logical grouping.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-12 13:55:29 -07:00
d4ac305073 sequencer.c: move static function to avoid forward decl
Move the reflog_message() function added in
96e832a5fd (sequencer (rebase -i): refactor setting the reflog
message, 2017-01-02), it gained another user in
9055e401dd (sequencer: introduce new commands to reset the revision,
2018-04-25). Let's move it around and remove the forward declaration
added in the latter commit.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-12 12:09:53 -07:00
103e02c700 *.c static functions: don't forward-declare __attribute__
9cf6d3357a (Add git-index-pack utility, 2005-10-12) and
466dbc42f5 (receive-pack: Send internal errors over side-band #2,
2010-02-10) we added these static functions and forward-declared their
__attribute__((printf)).

I think this may have been to work around some compiler limitation at
the time, but in any case we have a lot of code that uses the briefer
way of declaring these that I'm using here, so if we had any such
issues with compilers we'd have seen them already.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-12 12:09:53 -07:00
8c8195e9c3 submodule--helper: introduce add-clone subcommand
Let's add a new "add-clone" subcommand to `git submodule--helper` with
the goal of converting part of the shell code in git-submodule.sh
related to `git submodule add` into C code. This new subcommand clones
the repository that is to be added, and checks out to the appropriate
branch.

This is meant to be a faithful conversion that leaves the behaviour of
'cmd_add()' script unchanged.

Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Based-on-patch-by: Shourya Shukla <periperidip@gmail.com>
Based-on-patch-by: Prathamesh Chavan <pc44800@gmail.com>
Helped-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-12 12:06:21 -07:00
a98b02c112 submodule--helper: refactor module_clone()
Separate out the core logic of module_clone() from the flag
parsing---this way we can call the equivalent of the `submodule--helper
clone` subcommand directly within C, without needing to push arguments
in a strvec.

Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-12 12:06:21 -07:00
0008d12284 submodule: prefix die messages with 'fatal'
The standard `die()` function that is used in C code prefixes all the
messages passed to it with 'fatal: '. This does not happen with the
`die` used in 'git-submodule.sh'.

Let's prefix each of the shell die messages with 'fatal: ' so that when
they are converted to C code, the error messages stay the same as before
the conversion.

Note that the shell version of `die` exits with error code 1, while the
C version exits with error code 128. In practice, this does not change
any behaviour, as no functionality in 'submodule add' and 'submodule
update' relies on the value of the exit code.

Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-12 12:06:21 -07:00
d1c5ae78ce rev-list: add option for --pretty=format without header
In general, we encourage users to use plumbing commands, like git
rev-list, over porcelain commands, like git log, when scripting.
However, git rev-list has one glaring problem that prevents it from
being used in certain cases: when --pretty is used with a custom format,
it always prints out a line containing "commit" and the object ID.  This
makes it unsuitable for many scripting needs, and forces users to use
git log instead.

While we can't change this behavior for backwards compatibility, we can
add an option to suppress this behavior, so let's do so, and call it
"--no-commit-header".  Additionally, add the corresponding positive
option to switch it back on.

Note that this option doesn't affect the built-in formats, only custom
formats.  This is exactly the same behavior as users already have from
git log and is what most users will be used to.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-12 10:12:31 -07:00
6f70f00b4f commit: remove irrelavent prompt on --allow-empty-message
Even when the `--allow-empty-message` option is given, "git commit"
offers an interactive editor session with prefilled message that says
the commit will be aborted if the buffer is emptied, which is wrong.

Remove the "an empty message aborts" part from the message when the
option is given to fix it.

Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Helped-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Hu Jialun <hujialun@comp.nus.edu.sg>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-09 12:08:18 -07:00
54ba2f1862 commit: reorganise commit hint strings
Strings of hint messages inserted into editor on interactive commit was
scattered in-line, rendering the code harder to understand at first
glance.

Extract those messages out into separate variables to make the code
outline easier to follow.

Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Hu Jialun <hujialun@comp.nus.edu.sg>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-09 12:04:51 -07:00
561fa03529 pack-objects: fix segfault in --stdin-packs option
Fix a segfault in the --stdin-packs option added in
339bce27f4 (builtin/pack-objects.c: add '--stdin-packs' option,
2021-02-22).

The read_packs_list_from_stdin() function didn't check that the lines
it was reading were valid packs, and thus when doing the QSORT() with
pack_mtime_cmp() we'd have a NULL "util" field. The "util" field is
used to associate the names of included/excluded packs with the
packed_git structs they correspond to.

The logic error was in assuming that we could iterate all packs and
annotate the excluded and included packs we got, as opposed to
checking the lines we got on stdin. There was a check for excluded
packs, but included packs were simply assumed to be valid.

As noted in the test we'll not report the first bad line, but whatever
line sorted first according to the string-list.c API. In this case I
think that's fine. There was further discussion of alternate
approaches in [1].

Even though we're being lazy let's assert the line we do expect to get
in the test, since whoever changes this code in the future might miss
this case, and would want to update the test and comments.

1. http://lore.kernel.org/git/YND3h2l10PlnSNGJ@nand.local

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-09 11:53:40 -07:00
8232a0ff48 pkt-line: replace "stateless separator" with "response end"
In 0181b600a6 (pkt-line: define PACKET_READ_RESPONSE_END, 2020-05-19),
the Response End packet was defined for Git's network protocol. When the
patch was sent, it included an oversight where the error messages
referenced "stateless separator", the work-in-progress name, over
"response end", the final name chosen.

Correct these error messages by having them correctly reference
a "response end" packet.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-09 11:51:17 -07:00
ae4e099e7c l10n: fixed typos of mismatched constant strings
Andrei pointed out a typo in the Swedish translation, where a config
variable name had been copied incorrectly.

By introducing typo detection function in "git-po-helper", more typos
were found. All easy-to-fix typos were fixed in this commit.

Reported-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Peter Krefting <peter@softwolves.pp.se>
Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
2021-07-09 08:16:12 +08:00
5dbeffdeaf Merge branch 'master' of github.com:Softcatala/git-po
* 'master' of github.com:Softcatala/git-po:
  l10n: Update Catalan translation
2021-07-09 08:12:20 +08:00
d486ca60a5 The third batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-08 13:15:06 -07:00
32d6280226 Merge branch 'js/stop-exporting-bogus-columns'
When we cannot figure out how wide the terminal is, we use a
fallback value of 80 ourselves (which cannot be avoided), but when
we run the pager, we export it in COLUMNS, which forces the pager
to use the hardcoded value, even when the pager is perfectly
capable to figure it out itself.  Stop exporting COLUMNS when we
fall back on the hardcoded default value for our own use.

* js/stop-exporting-bogus-columns:
  pager: avoid setting COLUMNS when we're guessing its value
2021-07-08 13:15:06 -07:00
5ae1eb22c9 Merge branch 'dd/document-log-decorate-default'
Doc clean-up.

* dd/document-log-decorate-default:
  doc/log: correct default for --decorate
2021-07-08 13:15:05 -07:00
693575c2d1 Merge branch 'ar/test-code-cleanup'
Test code clean-up.

* ar/test-code-cleanup:
  t: fix whitespace around &&
2021-07-08 13:15:05 -07:00
069f9989cb Merge branch 'ba/object-info'
Code clean-up.

* ba/object-info:
  protocol-caps.h: add newline at end of file
2021-07-08 13:15:05 -07:00
2134c3fb1f Merge branch 'ab/progress-cleanup'
Code clean-up.

* ab/progress-cleanup:
  read-cache.c: don't guard calls to progress.c API
2021-07-08 13:15:05 -07:00
8e48f115aa Merge branch 'ab/xdiff-bug-cleanup'
Code clean-up.

* ab/xdiff-bug-cleanup:
  xdiff: use BUG(...), not xdl_bug(...)
2021-07-08 13:15:04 -07:00
b7bd70ddd4 Merge branch 'ms/mergetools-kdiff3-on-windows'
On Windows, mergetool has been taught to find kdiff3.exe just like
it finds winmerge.exe.

* ms/mergetools-kdiff3-on-windows:
  mergetools/kdiff3: make kdiff3 work on Windows too
2021-07-08 13:15:04 -07:00
e867110340 Merge branch 'ab/cmd-foo-should-return'
Code clean-up.

* ab/cmd-foo-should-return:
  builtins + test helpers: use return instead of exit() in cmd_*
2021-07-08 13:15:04 -07:00
11819cb741 Merge branch 'ar/doc-libera-chat-in-my-first-contrib'
Update MyFirst document.

* ar/doc-libera-chat-in-my-first-contrib:
  MyFirstContribution: link #git-devel to Libera Chat
2021-07-08 13:15:03 -07:00
c08e112b43 Merge branch 'ar/mailinfo-memcmp-to-skip-prefix'
Code clean-up.

* ar/mailinfo-memcmp-to-skip-prefix:
  mailinfo: use starts_with() when checking scissors
2021-07-08 13:15:03 -07:00
18b49be492 Merge branch 'jk/doc-max-pack-size'
Doc update.

* jk/doc-max-pack-size:
  doc: warn people against --max-pack-size
2021-07-08 13:15:03 -07:00
83ae1edff7 Merge branch 'ab/fix-columns-to-80-during-tests'
Output from some of our tests were affected by the width of the
terminal that they were run in, which has been corrected by
exporting a fixed value in the COLUMNS environment.

* ab/fix-columns-to-80-during-tests:
  test-lib.sh: set COLUMNS=80 for --verbose repeatability
2021-07-08 13:15:03 -07:00
018b85dead Merge branch 'ar/more-typofix'
Typofixes.

* ar/more-typofix:
  git-worktree.txt: fix typo in example path
  t: fix typos in test messages
  blame: correct name of config option in docs
2021-07-08 13:15:02 -07:00
a38c4c02e5 Merge branch 'fw/complete-cmd-idx-fix'
Recent update to completion script (in contrib/) broke those who
use the __git_complete helper to define completion to their custom
command.

* fw/complete-cmd-idx-fix:
  completion: bash: fix late declaration of __git_cmd_idx
2021-07-08 13:15:02 -07:00
62473695d2 Merge branch 'jk/test-without-readlink-1'
Some test scripts assumed that readlink(1) was universally
installed and available, which is not the case.

* jk/test-without-readlink-1:
  t: use portable wrapper for readlink(1)
2021-07-08 13:15:02 -07:00
7cc1147371 Merge branch 'jx/sideband-cleanup'
The side-band demultiplexer that is used to display progress output
from the remote end did not clear the line properly when the end of
line hits at a packet boundary, which has been corrected.  Also
comes with test clean-ups.

* jx/sideband-cleanup:
  test: refactor to use "get_abbrev_oid" to get abbrev oid
  test: refactor to use "test_commit" to create commits
  test: compare raw output, not mangle tabs and spaces
  sideband: don't lose clear-to-eol at packet boundary
2021-07-08 13:15:01 -07:00
905549ff4e Merge branch 'jk/test-avoid-globmatch-with-skip-patterns'
We broke "GIT_SKIP_TESTS=t?000" to skip certain tests in recent
update, which got fixed.

* jk/test-avoid-globmatch-with-skip-patterns:
  test-lib: avoid accidental globbing in match_pattern_list()
2021-07-08 13:15:01 -07:00
f741069550 Merge branch 'jv/userdiff-csharp-update'
The userdiff pattern for C# learned the token "record".

* jv/userdiff-csharp-update:
  userdiff: add support for C# record types
2021-07-08 13:15:01 -07:00
5be2a2c74b Merge branch 'ab/config-hooks-path-testfix'
Test fix.

* ab/config-hooks-path-testfix:
  pre-commit hook tests: don't leave "actual" nonexisting on failure
2021-07-08 13:15:01 -07:00
221ec24e9b Merge branch 'fc/pull-cleanups'
Code cleanup.

* fc/pull-cleanups:
  pull: trivial whitespace style fix
  pull: trivial cleanup
  pull: cleanup autostash check
2021-07-08 13:15:00 -07:00
1ef488eaaa Merge branch 'jk/bitmap-tree-optim'
Avoid duplicated work while building reachability bitmaps.

* jk/bitmap-tree-optim:
  bitmaps: don't recurse into trees already in the bitmap
2021-07-08 13:15:00 -07:00
9f8aa6089a Merge branch 'ah/graph-typofix'
Typofix in an error message.

* ah/graph-typofix:
  graph: improve grammar of "invalid color" error message
2021-07-08 13:15:00 -07:00
102969c422 Merge branch 'jx/t6020-with-older-bash'
Work around inefficient glob substitution in older versions of bash
by rewriting parts of a test.

* jx/t6020-with-older-bash:
  t6020: fix incompatible parameter expansion
2021-07-08 13:14:59 -07:00
a515f26eac Merge branch 'ar/typofix'
Typofixes.

* ar/typofix:
  *: fix typos which duplicate a word
2021-07-08 13:14:59 -07:00
4677587e57 Merge branch 'jk/revision-squelch-gcc-warning'
Warning fix.

* jk/revision-squelch-gcc-warning:
  add_pending_object_with_path(): work around "gcc -O3" complaint
2021-07-08 13:14:59 -07:00
1d38852b11 Merge branch 'ah/uninitialized-reads-fix'
Make the codebase MSAN clean.

* ah/uninitialized-reads-fix:
  builtin/checkout--worker: zero-initialise struct to avoid MSAN complaints
  split-index: use oideq instead of memcmp to compare object_id's
  bulk-checkin: make buffer reuse more obvious and safer
2021-07-08 13:14:59 -07:00
7e24201365 Merge branch 'js/no-more-multimail'
Remove multimail from contrib/

* js/no-more-multimail:
  multimail: stop shipping a copy
2021-07-08 13:14:58 -07:00
e22ac8b126 Merge branch 'js/subtree-on-windows-fix'
Update "git subtree" to work better on Windows.

* js/subtree-on-windows-fix:
  subtree: fix assumption about the directory separator
  subtree: fix the GIT_EXEC_PATH sanity check to work on Windows
2021-07-08 13:14:58 -07:00
0800bedcc7 Merge branch 'dd/svn-test-wo-locale-a'
"git-svn" tests assumed that "locale -a", which is used to pick an
available UTF-8 locale, is available everywhere.  A knob has been
introduced to allow testers to specify a suitable locale to use.

* dd/svn-test-wo-locale-a:
  t: use user-specified utf-8 locale for testing svn
2021-07-08 13:14:58 -07:00
11fac260fe Merge branch 'fc/doc-default-to-upstream-config'
Doc clean-up.

* fc/doc-default-to-upstream-config:
  doc: merge: mention default of defaulttoupstream
2021-07-08 13:14:57 -07:00
9c7a1fc9b6 Merge branch 'js/trace2-discard-event-docfix'
Docfix.

* js/trace2-discard-event-docfix:
  docs: fix api-trace2 doc for "too_many_files" event
2021-07-08 13:14:57 -07:00
3a7d26bb4b Merge branch 'tb/complete-diff-anchored'
The command line completion (in contrib/) learned that "git diff"
takes the "--anchored" option.

* tb/complete-diff-anchored:
  completion: add --anchored to diff's options
2021-07-08 13:14:56 -07:00
40098093c6 Merge branch 'tk/partial-clone-repack-doc'
Docfix.

* tk/partial-clone-repack-doc:
  Remove warning that repack only works on non-promisor packfiles
2021-07-08 13:14:56 -07:00
eff40457a4 fetch: fix segfault in --negotiate-only without --negotiation-tip=*
The recent --negotiate-only option would segfault in the call to
oid_array_for_each() in negotiate_using_fetch() unless one or more
--negotiation-tip=* options were provided.

All of the other tests for the feature combine both, but nothing was
checking this assumption, let's do that and add a test for it. Fixes a
bug in 9c1e657a8f (fetch: teach independent negotiation (no
packfile), 2021-05-04).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-08 08:20:16 -07:00
e651013a8b l10n: Update Catalan translation
Signed-off-by: Jordi Mas <jmas@softcatala.org>
2021-07-08 10:28:27 +02:00
92d8ed8ac1 oidtree: a crit-bit tree for odb_loose_cache
This saves 8K per `struct object_directory', meaning it saves
around 800MB in my case involving 100K alternates (half or more
of those alternates are unlikely to hold loose objects).

This is implemented in two parts: a generic, allocation-free
`cbtree' and the `oidtree' wrapper on top of it.  The latter
provides allocation using alloc_state as a memory pool to
improve locality and reduce free(3) overhead.

Unlike oid-array, the crit-bit tree does not require sorting.
Performance is bound by the key length, for oidtree that is
fixed at sizeof(struct object_id).  There's no need to have
256 oidtrees to mitigate the O(n log n) overhead like we did
with oid-array.

Being a prefix trie, it is natively suited for expanding short
object IDs via prefix-limited iteration in
`find_short_object_filename'.

On my busy workstation, p4205 performance seems to be roughly
unchanged (+/-8%).  Startup with 100K total alternates with no
loose objects seems around 10-20% faster on a hot cache.
(800MB in memory savings means more memory for the kernel FS
cache).

The generic cbtree implementation does impose some extra
overhead for oidtree in that it uses memcmp(3) on
"struct object_id" so it wastes cycles comparing 12 extra bytes
on SHA-1 repositories.  I've not yet explored reducing this
overhead, but I expect there are many places in our code base
where we'd want to investigate this.

More information on crit-bit trees: https://cr.yp.to/critbit.html

Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-07 21:28:04 -07:00
90e07f0a34 oidcpy_with_padding: constify `src' arg
As with `oidcpy', the source struct will not be modified and
this will allow an upcoming const-correct caller to use it.

Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-07 21:28:02 -07:00
33f379eee6 make object_directory.loose_objects_subdir_seen a bitmap
There's no point in using 8 bits per-directory when 1 bit
will do.  This saves us 224 bytes per object directory, which
ends up being 22MB when dealing with 100K alternates.

Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-07 21:27:58 -07:00
407532f82d avoid strlen via strbuf_addstr in link_alt_odb_entry
We can save a few milliseconds (across 100K odbs) by using
strbuf_addbuf() instead of strbuf_addstr() by passing `entry' as
a strbuf pointer rather than a "const char *".

Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-07 21:27:56 -07:00
cf2dc1c238 speed up alt_odb_usable() with many alternates
With many alternates, the duplicate check in alt_odb_usable()
wastes many cycles doing repeated fspathcmp() on every existing
alternate.  Use a khash to speed up lookups by odb->path.

Since the kh_put_* API uses the supplied key without
duplicating it, we also take advantage of it to replace both
xstrdup() and strbuf_release() in link_alt_odb_entry() with
strbuf_detach() to avoid the allocation and copy.

In a test repository with 50K alternates and each of those 50K
alternates having one alternate each (for a total of 100K total
alternates); this speeds up lookup of a non-existent blob from
over 16 minutes to roughly 2.7 seconds on my busy workstation.

Note: all underlying git object directories were small and
unpacked with only loose objects and no packs.  Having to load
packs increases times significantly.

Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-07 17:21:12 -07:00
351bca2d1f imap-send.c: use less verbose strbuf_fread() idiom
When looking for things that hardcoded a non-zero "hint" parameter to
strbuf_fread() I discovered that since f2561fda36 (Add git-imap-send,
derived from isync 1.0.1., 2006-03-10) we've been passing a hardcoded
4096 in imap-send.c to read stdin.

Since we're not doing anything unusual here let's use a less verbose
pattern used in a lot of other places (the hint of "0" will default to
8192). We don't need to take a FILE * here either, so we can use "0"
instead of "stdin". While we're at it improve the error message if we
can't read the input to use error_errno().

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-07 14:29:12 -07:00
84069fcc14 t7400: test failure to add submodule in tracked path
Add a test to ensure failure on adding a submodule to a directory with
tracked contents in the index.

As we are going to refactor and port to C some parts of `git submodule
add`, let's add a test to help ensure no regression is introduced.

Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Based-on-patch-by: Shourya Shukla <periperidip@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-07 11:00:01 -07:00
e04170697a docs: .gitignore parsing is to the top of the repo
The current documentation reads as if .gitignore files will be parsed in
every parent directory, and not until they reach a repository boundary.
This clarifies the current behaviour.

As well, this corrects 'toplevel' to 'top-level', matching usage for
'top-level domain'.

Signed-off-by: Andrew Berry <andrew@furrypaws.ca>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-06 17:20:02 -07:00
6a24cc71ed submodule--helper: remove redundant include
"dir.h" should have been included only once.

Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-06 14:01:08 -07:00
d5659f856f help: convert git_cmd to page in one place
Depending on the chosen format of help pages, git-help uses function
show_man_page, show_info_page, or show_html_page.  The first thing all
three functions do is to convert given `git_cmd` to a `page` using
function cmd_to_page.

Move the common part of these three functions to function cmd_help to
avoid code duplication.

Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Reviewed-by: Felipe Contreras <felipe.contreras@gmail.com>
Acked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-06 13:09:20 -07:00
5632e838f8 khash: clarify that allocations never fail
We use our standard allocation functions and macros (xcalloc,
ALLOC_ARRAY, REALLOC_ARRAY) in our version of khash.h.  They terminate
the program on error instead, so code that's using them doesn't have to
handle allocation failures.  Make this behavior explicit by turning
kh_resize_ into a void function and removing the related unreachable
error handling code.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: René Scharfe <l.s.r@web.de>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-06 13:07:50 -07:00
52a47aea70 t7509: avoid direct file access for writing CHERRY_PICK_HEAD
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-06 12:56:38 -07:00
ae815940f6 t1415: avoid direct filesystem access for writing refs
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-06 12:56:38 -07:00
ba8c2680ec t6402: preserve git exit status code
In t6402, we're checking number of files in the index and the working
tree by piping the output of Git's command to "wc -l", thus losing the
exit status code of git.

Let's use the new helper test_stdout_line_count in order to preserve
Git's exit status code.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-06 12:24:12 -07:00
66c9562013 t6400: preserve git ls-files exit status code
In t6400, we're checking number of files in the index and the working
tree by piping the output of "git ls-files" to "wc -l", thus losing the
exit status code of git.

Let's use the newly introduced test_stdout_line_count in order to check
the exit status code of Git's command.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-06 12:24:11 -07:00
cdff1bb5a3 test-lib-functions: introduce test_stdout_line_count
In some tests, we're checking the number of lines in output of some
commands, including but not limited to Git's command.

We're doing the check by running those commands in the left side of
a pipe, thus losing the exit status code of those commands. Meanwhile,
we really want to check the exit status code of Git's command.

Let's write the output of those commands to a temporary file, and use
test_line_count separately in order to check exit status code of
those commands properly.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-06 12:22:27 -07:00
0dc787a9f2 ci: accelerate the checkout
By upgrading from v1 to v2 of `actions/checkout`, we avoid fetching all
the tags and the complete history: v2 only fetches one revision by
default. This should make things a lot faster.

Note that `actions/checkout@v2` seems to be incompatible with running in
containers: https://github.com/actions/checkout/issues/151. Therefore,
we stick with v1 there.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-06 12:20:58 -07:00
9ab0b66129 ci (vs-build): build with NO_GETTEXT
We already build Git for Windows with `NO_GETTEXT` when compiling with
GCC. Let's do the same with Visual C, too.

Note that we do not technically _need_ to pass `NO_GETTEXT` explicitly
in that `make artifacts-tar` invocation because we do this while `MSVC`
is set (which will set `uname_S := Windows`, which in turn will set
`NO_GETTEXT = YesPlease`). But it is definitely nicer to be explicit
here.

Signed-off-by: Dennis Ameling <dennis@dennisameling.com>
Helped-by: Matthias Aßhauer <mha1993@live.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-06 12:20:58 -07:00
4348824059 artifacts-tar: respect NO_GETTEXT
We obviously do not want to bundle `.mo` files during `make
artifacts-tar NO_GETTEXT=Yep`, but that was the case.

To fix that, go a step beyond just fixing the symptom, and simply
define the lists of `.po` and `.mo` files as empty if `NO_GETTEXT` is
set.

Helped-by: Matthias Aßhauer <mha1993@live.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-06 12:20:58 -07:00
d681d0dc3a ci (windows): transfer also the Git-tracked files to the test jobs
Git's test suite is excruciatingly slow on Windows, mainly due to the
fact that it executes a lot of shell script code, and that's simply not
native to Windows.

To help with that, we established the pattern where the artifacts are
first built in one job, and then multiple test jobs run in parallel
using the artifacts built in the first job.

We take pains in transferring only the build outputs, and letting
`actions/checkout` fill in the rest of the files.

One major downside of that strategy is that the test jobs might fail to
check out the intended revision (e.g. because the branch has been
updated while the build was running, as is frequently the case with the
`seen` branch).

Let's transfer also the files tracked by Git, and skip the checkout step
in the test jobs.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-06 12:20:58 -07:00
10b635b773 bundle: remove "ref_list" in favor of string-list.c API
Move away from the "struct ref_list" in bundle.c in favor of the
almost identical string-list.c API.

That API fits this use-case perfectly, but did not exist in its
current form when this code was added in 2e0afafebd (Add git-bundle:
move objects and references by archive, 2007-02-22), with hindsight we
could have used the path-list API, which later got renamed to
string-list. See 8fd2cb4069 (Extract helper bits from
c-merge-recursive work, 2006-07-25)

We need to change "name" to "string" and "oid" to "util" to make this
conversion, but other than that the APIs are pretty much identical for
what bundle.c made use of.

Let's also replace the memset(..,0,...) pattern with a more idiomatic
"INIT" macro, and finally add a *_release() function so to free the
allocated memory.

Before this the add_to_ref_list() would leak memory, now e.g. "bundle
list-heads" reports no memory leaks at all under valgrind.

In the bundle_header_init() function we're using a clever trick to
memcpy() what we'd get from the corresponding
BUNDLE_HEADER_INIT. There is a concurrent series to make use of that
pattern more generally, see [1].

1. https://lore.kernel.org/git/cover-0.5-00000000000-20210701T104855Z-avarab@gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-06 12:10:17 -07:00
15e7c7dca6 bundle.c: use a temporary variable for OIDs and names
In preparation for moving away from accessing the OID and name via the
"oid" and "name" slots in a subsequent commit, change the code that
accesses it to use named variables. This makes the subsequent change
smaller.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-06 12:10:16 -07:00
db6bfb9fe8 bundle cmd: stop leaking memory from parse_options_cmd_bundle()
Fix a memory leak from the prefix_filename() function introduced with
its use in 3b754eedd5 (bundle: use prefix_filename with bundle path,
2017-03-20).

As noted in that commit the leak was intentional as a part of being
sloppy about freeing resources just before we exit, I'm changing this
because I'll be fixing other memory leaks in the bundle API (including
the library version) in subsequent commits. It's easier to reason
about those fixes if valgrind runs cleanly at the end without any
leaks whatsoever.

An earlier version of this change[1] went out of its way to not leak
memory on the die() codepaths here, but doing so will only avoid
reports of potential leaks under heap-only leak trackers such as
valgrind, not the SANITIZE=leak mode.

Avoiding those leaks as well might be useful to enable us to run
cleanly under the likes of valgrind in the future. But for now the
relative verbosity of the resulting code, and the fact that we don't
have some valgrind or SANITIZE=leak mode as part of our CI (it's only
run ad-hoc, see [2]), means we're not worrying about that for now.

1. https://lore.kernel.org/git/87v95vdxrc.fsf@evledraar.gmail.com/
2. https://lore.kernel.org/git/87czsv2idy.fsf@evledraar.gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-06 12:10:16 -07:00
f3abf6aa63 l10n: id.po: fix mismatched variable names
Jiang Xin notified me [1] to fix some typos. Running git-po-helper [2]
check, it found following:

  1. git submodule--helper typo (was git submodule==helper)
  2. new_index improper translation (DO NOT translate instead)

Fix them.

[1]:
https://lore.kernel.org/git/20210703111837.14894-1-worldhello.net@gmail.com/
[2]:
e44df847ab

Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
2021-07-05 10:02:34 +08:00
3663e5904d perf: fix when running with TEST_OUTPUT_DIRECTORY
When the TEST_OUTPUT_DIRECTORY is defined, then all test data will be
written in that directory instead of the default directory located in
"t/". While this works as expected for our normal tests, performance
tests fail to locate and aggregate performance data because they don't
know to handle TEST_OUTPUT_DIRECTORY correctly and always look at the
default location.

Fix the issue by adding a `--results-dir` parameter to "aggregate.perl"
which identifies the directory where results are and by making the "run"
script awake of the TEST_OUTPUT_DIRECTORY variable.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-02 15:47:30 -07:00
bc40dfb10a string-list.h users: change to use *_{nodup,dup}()
Change all in-tree users of the string_list_init(LIST, BOOL) API to
use string_list_init_{nodup,dup}(LIST) instead.

As noted in the preceding commit let's leave the now-unused
string_list_init() wrapper in-place for any in-flight users, it can be
removed at some later date.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-01 12:32:22 -07:00
770fedaf9f string-list.[ch]: add a string_list_init_{nodup,dup}()
In order to use the new "memcpy() a 'blank' struct on the stack"
pattern for string_list_init(), and to make the macro initialization
consistent with the function initialization introduce two new
string_list_init_{nodup,dup}() functions. These are like the old
string_list_init() when called with a false and true second argument,
respectively.

I think this not only makes things more consistent, but also easier to
read. I often had to lookup what the ", 0)" or ", 1)" in these
invocations meant, now it's right there in the function name, and
corresponds to the macros.

A subsequent commit will convert existing API users to this pattern,
but as this is a very common API let's leave a compatibility function
in place for later removal. This intermediate state also proves that
the compatibility function works.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-01 12:32:22 -07:00
ce93a4c612 dir.[ch]: replace dir_init() with DIR_INIT
Remove the dir_init() function and replace it with a DIR_INIT
macro. In many cases in the codebase we need to initialize things with
a function for good reasons, e.g. needing to call another function on
initialization. The "dir_init()" function was not one such case, and
could trivially be replaced with a more idiomatic macro initialization
pattern.

The only place where we made use of its use of memset() was in
dir_clear() itself, which resets the contents of an an existing struct
pointer. Let's use the new "memcpy() a 'blank' struct on the stack"
idiom to do that reset.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-01 12:32:22 -07:00
5726a6b401 *.c *_init(): define in terms of corresponding *_INIT macro
Change the common patter in the codebase of duplicating the
initialization logic between an *_INIT macro and a
corresponding *_init() function to use the macro as the canonical
source of truth.

Now we no longer need to keep the function up-to-date with the macro
version. This implements a suggestion by Jeff King who found that
under -O2 [1] modern compilers will init new version in place without
the extra copy[1]. The performance of a single *_init() won't matter
in most cases, but even if it does we're going to be producing
efficient machine code to perform these operations.

1. https://lore.kernel.org/git/YNyrDxUO1PlGJvCn@coredump.intra.peff.net/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-01 12:32:22 -07:00
3d97ea479f *.h: move some *_INIT to designated initializers
Move *_INIT macros I'll use in a subsequent commits to designated
initializers. This isn't required for those follow-up changes, but
since next commits will change things in this area, let's use the
modern pattern over the old one while we're at it.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-01 12:31:45 -07:00
560bf51892 test-lib: avoid accidental globbing in match_pattern_list()
We have a custom match_pattern_list() function which we use for matching
test names (like "t1234") against glob-like patterns (like "t1???") for
$GIT_SKIP_TESTS, --verbose-only, etc.

Those patterns may have multiple whitespace-separated elements (e.g.,
"t0* t1234 t5?78"). The callers of match_pattern_list thus pass the
strings unquoted, so that the shell does the usual field-splitting into
separate arguments.

But this also means the shell will do the usual globbing for each
argument, which can result in us seeing an expansion based on what's in
the filesystem, rather than the real pattern. For example, if I have the
path "t5000" in the filesystem, and you feed the pattern "t?000", that
_should_ match the string "t0000", but it won't after the shell has
expanded it to "t5000".

This has been a bug ever since that function was introduced. But it
didn't usually trigger since we typically use the function inside the
trash directory, which has a very limited set of files that are unlikely
to match. It became a lot easier to trigger after edc23840b0 (test-lib:
bring $remove_trash out of retirement, 2021-05-10), because now we match
$GIT_SKIP_TESTS before even entering the trash directory. So the t5000
example above can be seen with:

  GIT_SKIP_TESTS=t?000 ./t0000-basic.sh

which should skip all tests but doesn't.

We can fix this by using "set -f" to ask the shell not to glob (which is
in POSIX, so should hopefully be portable enough). We only want to do
this in a subshell (to avoid polluting the rest of the script), which
means we need to get the whole string intact into the match_pattern_list
function by quoting it. Arguably this is a good idea anyway, since it
makes it much more obvious that we intend to split, and it's not simply
sloppy scripting.

Diagnosed-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-01 12:29:32 -07:00
bff9703f0a Merge branch 'zh/cat-file-batch-fix' into zh/ref-filter-raw-data
* zh/cat-file-batch-fix:
  cat-file: merge two block into one
  cat-file: handle trivial --batch format with --batch-all-objects
2021-07-01 12:16:43 -07:00
60fadf8bd2 fetch: document the --negotiate-only option
There was no documentation for the --negotiate-only option added in
9c1e657a8f (fetch: teach independent negotiation (no packfile),
2021-05-04), only documentation for the related push.negotiation
option added in the following commit in 477673d6f3 (send-pack:
support push negotiation, 2021-05-04).

Let's document it, and update the cross-linking I'd added between
--negotiation-tip=* and 'fetch.negotiationAlgorithm' in
526608284a (fetch doc: cross-link two new negotiation options,
2018-08-01).

I think it would be better to say "in common with the remote" here
than "...the server", but the documentation for --negotiation-tip=*
above this talks about "the server", so let's continue doing that in
this related option. See 3390e42adb (fetch-pack: support negotiation
tip whitelist, 2018-07-02) for that documentation.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-30 14:57:22 -07:00
1e5b5ea538 send-pack.c: move "no refs in common" abort earlier
Move the early return if we have no remote refs in send_pack()
earlier.

When this was added in 4c353e890c (Warn when send-pack does nothing,
2005-12-04) one of the first things we'd do was to abort, but as of
cfee10a773 (send-pack/receive-pack: allow errors to be reported back
to pusher., 2005-12-25) we've added numerous server_supports()
conditions that are acted on later in the function, that won't be used
if we don't have remote refs.

Then as of 477673d6f3 (send-pack: support push negotiation,
2021-05-04) we started doing even more work on the assumption that we
had some remote refs to feed to --negotiation-tip=* options.

We only hit this condition if we have nothing to push, so we don't
need to consider "push.negotiate" etc. only to do nothing with that
information.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-30 14:57:22 -07:00
3585d0ea23 merge-recursive: handle rename-to-self case
Directory rename detection can cause transitive renames, e.g. if the two
different sides of history each do one half of:
    A/file -> B/file
    B/     -> C/
then directory rename detection transitively renames to give us
    A/file -> C/file

However, when C/ == A/, note that this gives us
    A/file -> A/file.

merge-recursive assumed that any rename D -> E would have D != E.  While
that is almost always true, the above is a special case where it is not.
So we cannot do things like delete the rename source, we cannot assume
that a file existing at path E implies a rename/add conflict and we have
to be careful about what stages end up in the output.

This change feels a bit hackish.  It took me surprisingly many hours to
find, and given merge-recursive's design causing it to attempt to
enumerate all combinations of edge and corner cases with special code
for each combination, I'm worried there are other similar fixes needed
elsewhere if we can just come up with the right special testcase.
Perhaps an audit would rule it out, but I have not the energy.
merge-recursive deserves to die, and since it is on its way out anyway,
fixing this particular bug narrowly will have to be good enough.

Reported-by: Anders Kaseorg <andersk@mit.edu>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-30 14:40:10 -07:00
a492d5331c merge-ort: ensure we consult df_conflict and path_conflicts
Path conflicts (typically rename path conflicts, e.g.
rename/rename(1to2) or rename/add/delete), and directory/file conflicts
should obviously result in files not being marked as clean in the merge.
We had a codepath where we missed consulting the path_conflict and
df_conflict flags, based on match_mask.  Granted, it requires an unusual
setup to trigger this codepath (directory rename causing rename-to-self
is the only case I can think of), but we still need to handle it.  To
make it clear that we have audited the other codepaths that do not
explicitly mention these flags, add some assertions that the flags are
not set.

Reported-by: Anders Kaseorg <andersk@mit.edu>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-30 14:40:10 -07:00
806f83287f t6423: test directory renames causing rename-to-self
Directory rename detection can cause transitive renames, e.g. if the two
different sides of history each do one half of:
    A/file -> B/file
    B/     -> C/
then directory rename detection transitively renames to give us C/file.
Since the default for merge.directoryRenames is conflict, this results
in an error message saying it is unclear whether the file should be
placed at B/file or C/file.

What if C/ is A/, though?  In such a case, the transitive rename would
give us A/file, the original name we started with.  Logically, having
an error message with B/file vs. A/file should be fine, as should
leaving the file where it started.  But the logic in both
merge-recursive and merge-ort did not handle a case of a filename being
renamed to itself correctly; merge-recursive had two bugs, and merge-ort
had one.  Add some testcases covering such a scenario.

Based-on-testcase-by: Anders Kaseorg <andersk@mit.edu>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-30 14:40:09 -07:00
fe7fe62d8d grep: report missing left operand of --and
Git grep allows combining two patterns with --and.  It checks and
reports if the second pattern is missing when compiling the expression.
A missing first pattern, however, is only reported later at match time.
Thus no error is returned if no matching is done, e.g. because no file
matches the also given pathspec.

When that happens we get an expression tree with an GREP_NODE_AND node
and a NULL pointer to the missing left child.  free_pattern_expr()
tries to dereference it during the cleanup at the end, which results
in a segmentation fault.

Fix this by verifying the presence of the left operand at expression
compilation time.

Reported-by: Matthew Hughes <matthewhughes934@gmail.com>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-30 14:19:03 -07:00
dc05929411 xmmap: inform Linux users of tuning knobs on ENOMEM
Linux users may benefit from additional information on how to
avoid ENOMEM from mmap despite the system having enough RAM to
accomodate them.  We can't reliably unmap pack windows to work
around the issue since malloc and other library routines may
mmap without our knowledge.

Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-29 23:14:25 -07:00
c49a177bec test-lib.sh: set COLUMNS=80 for --verbose repeatability
Some tests will fail under --verbose because while we've unset COLUMNS
since b1d645b58a (tests: unset COLUMNS inherited from environment,
2012-03-27), we also look for the columns with an ioctl(..,
TIOCGWINSZ, ...) on some platforms. By setting COLUMNS again we
preempt the TIOCGWINSZ lookup in pager.c's term_columns(), it'll take
COLUMNS over TIOCGWINSZ,

This fixes t0500-progress-display.sh., which broke because of a
combination of the this issue and the progress output reacting to the
column width since 545dc345eb (progress: break too long progress bar
lines, 2019-04-12). The t5324-split-commit-graph.sh fails in a similar
manner due to progress output, see [1] for details.

The issue is not specific to progress.c, the diff code also checks
COLUMNS and some of its tests can be made to fail in a similar
manner[2], anything that invokes a pager is potentially affected.

See ea77e675e5 (Make "git help" react to window size correctly,
2005-12-18) and ad6c3739a3 (pager: find out the terminal width before
spawning the pager, 2012-02-12) for how the TIOCGWINSZ code ended up
in pager.c

1. http://lore.kernel.org/git/20210624051253.GG6312@szeder.dev
2. https://lore.kernel.org/git/20210627074419.GH6312@szeder.dev/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-29 13:06:30 -07:00
033395be32 Makefile: add QUIET_GEN to "cscope" target
Don't show the very verbose $(FIND_SOURCE_FILES) command on every
"make cscope" invocation.

See my recent 3c80fcb591 (Makefile: add QUIET_GEN to "tags" and "TAGS"
targets, 2021-03-28) for the same fix for the other adjacent targets.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-29 13:04:00 -07:00
922f8bbbf1 Makefile: move ".PHONY: cscope" near its target
Move the ".PHONY: cscope" rule to live alongside the "cscope" target
itself, not to be all the way near the bottom where we define the
"FORCE" rule.

That line was last modified in 2f76919517 (MinGW: avoid collisions
between "tags" and "TAGS", 2010-09-28).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-29 13:04:00 -07:00
7b76d6bf22 Makefile: add and use the ".DELETE_ON_ERROR" flag
Use the GNU make ".DELETE_ON_ERROR" flag in our main Makefile, as we
already do in the Documentation/Makefile since db10fc6c09 (doc:
simplify Makefile using .DELETE_ON_ERROR, 2021-05-21).

Now if a command to make X fails X will be removed, the default
behavior of GNU make is to only do so if "make" itself is interrupted
with a signal.

E.g. if we now intentionally break one of the rules with:

    -       mv $@+ $@
    +       mv $@+ $@ && \
    +       false

We'll get output like:

    $ make git
        CC git.o
        LINK git
    make: *** [Makefile:2179: git] Error 1
    make: *** Deleting file 'git'
    $ file git
    git: cannot open `git' (No such file or directory)

Before this change we'd leave the file in place in under this
scenario.

As in db10fc6c09 this allows us to remove patterns of removing
leftover $@ files at the start of rules, since previous failing runs
of the Makefile won't have left those littered around anymore.

I'm not as confident that we should be replacing the "mv $@+ $@"
pattern entirely, since that means that external programs or one of
our other Makefiles might race and get partial content.

I'm not changing $(REMOTE_CURL_ALIASES) since that uses a ln/ln -s/cp
dance, and would require the addition of "-f" flags if the "rm" at the
start was removed. I've also got plans to fix that ln/ln -s/cp pattern
in another series.

For $(LIB_FILE) and $(XDIFF_LIB) we can rely on the "c" (create) being
present in ARFLAGS.

I'm not changing "$(ETAGS_TARGET)", "tags" and "cscope" because
they've got a messy combination of removing "$@+" not "$@" at the
beginning, or "$@*". I'm also addressing those in another series.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-29 08:03:45 -07:00
f89ecf7988 midx: report checksum mismatches during 'verify'
'git multi-pack-index verify' inspects the data in an existing MIDX for
correctness by checking that the recorded object offsets are correct,
and so on.

But it does not check that the file's trailing checksum matches the data
that it records. So, if an on-disk corruption happened to occur in the
final few bytes (and all other data was recorded correctly), we would:

  - get a clean result from 'git multi-pack-index verify', but
  - be unable to reuse the existing MIDX when writing a new one (since
    we now check for checksum mismatches before reusing a MIDX)

Teach the 'verify' sub-command to recognize corruption in the checksum
by calling midx_checksum_valid().

Suggested-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 20:36:17 -07:00
ec1e28ef9c midx: don't reuse corrupt MIDXs when writing
When writing a new multi-pack index, Git tries to reuse as much of the
data from an existing MIDX as possible, like object offsets. This is
done to avoid re-opening a bunch of *.idx files unnecessarily, but can
lead to problems if the data we are reusing is corrupt.

That's because we'll blindly reuse data from an existing MIDX without
checking its trailing checksum for validity. So if there is memory
corruption while writing a MIDX, or disk corruption in the intervening
period between writing and reuse, we'll blindly propagate those bad
values forward.

Suppose we experience a memory corruption while writing a MIDX such that
we write an incorrect object offset (or alternatively, the disk corrupts
the data after being written, but before being reused). Then when we go
to write a new MIDX, we'll reuse the bad object offset without checking
its validity. This means that the MIDX we just wrote is broken, but its
trailing checksum is in-tact, since we never bothered to look at the
values before writing.

In the above, a "git multi-pack-index verify" would have caught the
problem before writing, but writing a new MIDX wouldn't have noticed
anything wrong, blindly carrying forward the corrupt offset.

Individual pack indexes check their validity by verifying the crc32
attached to each entry when carrying data forward during a repack.
We could solve this problem for MIDXs in the same way, but individual
crc32's don't make much sense, since their entries are so small.
Likewise, checking the whole file on every read may be prohibitively
expensive if a repository has a lot of objects, packs, or both.

But we can check the trailing checksum when reusing an existing MIDX
when writing a new one. And a corrupt MIDX need not stop us from writing
a new one, since we can just avoid reusing the existing one at all and
pretend as if we are writing a new MIDX from scratch.

Suggested-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 20:36:17 -07:00
15316a4732 commit-graph: rewrite to use checksum_valid()
Rewrite an existing caller in `git commit-graph verify` to take
advantage of checksum_valid().

Note that the replacement isn't a verbatim cut-and-paste, since the new
function avoids using hashfile at all and instead talks to the_hash_algo
directly, but it is functionally equivalent.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 20:36:17 -07:00
f9221e2cf5 csum-file: introduce checksum_valid()
Introduce a new function which checks the validity of a file's trailing
checksum. This is similar to hashfd_check(), but different since it is
intended to be used by callers who aren't writing the same data (like
`git index-pack --verify`), but who instead want to validate the
integrity of data that they are reading.

Rewrite the first of two callers which could benefit from this new
function in pack-check.c. Subsequent callers will be added in the
following patches.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 20:36:17 -07:00
e9f79acb28 ci: upgrade to using actions/{up,down}load-artifacts v2
The GitHub Actions to upload/download workflow artifacts saw a major
upgrade since Git's GitHub workflow was established. Let's use it.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 20:35:40 -07:00
abb2b389f7 ci (vs-build): use cmd to copy the DLLs, not powershell
We use a `.bat` script to copy the DLLs in the `vs-build` job, and those
type of scripts are native to CMD, not to PowerShell.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 20:35:39 -07:00
0eb6c189a3 ci: use the new GitHub Action to download git-sdk-64-minimal
In our continuous builds, Windows is the odd cookie that requires a
complete development environment to be downloaded because there is no
suitable one installed by default on Windows.

Side note: technically, there _is_ a development environment present in
GitHub Actions' build agents: MSYS2. But it differs from Git for
Windows' SDK in subtle points, unfortunately enough so to prevent Git's
test suite from running without failures.

Traditionally, we support downloading this environment (which we
nicknamed `git-sdk-64-minimal`) via a PowerShell scriptlet that accesses
the build artifacts of a dedicated Azure Pipeline (which packages a tiny
subset of the full Git for Windows SDK, containing just enough to build
Git and run its test suite).

This PowerShell script is unfortunately not very robust and sometimes
fails due to network issues.

Of course, we could add code to detect that situation, wait a little,
try again, if it fails again wait a little longer, lather, rinse and
repeat.

Instead of doing all of this in Git's own `.github/workflows/`, though,
let's offload this logic to the new GitHub Action at
https://github.com/marketplace/actions/setup-git-for-windows-sdk

This Action not only downloads and extracts git-sdk-64-minimal _outside_
the worktree (making it no longer necessary to meddle with
`.gitignore` or `.git/info/exclude`), it also adds the `bash.exe` to the
`PATH` and sets the environment variable `MSYSTEM` (an implementation
detail that Git's workflow should never have needed to know about).

This allows us to convert all those funny PowerShell tasks that wanted
to call git-sdk-64-minimal's `bash.exe`: they all are now regular `bash`
scriptlets.

This finally lets us get rid of the funny quoting and escaping where we
had to pay attention not only to quote and escape the Bash scriptlets
properly, but also to add a second level of escaping (with backslashes
for double quotes and backticks for dollar signs) to stop PowerShell
from doing unintended things.

Further, this Action uses a fast caching strategy native to GitHub
Actions that should accelerate the download across CI runs:
git-sdk-64-minimal is usually updated once per 24h, and needs to be
cached only once within that period. Caching it (unfortunately only on
a per-branch basis) speeds up the download step, and makes it much more
robust at the same time by virtue of accessing a cache location that is
closer in the network topology.

With this we can drop the home-rolled caching where we try to accelerate
the test phase by uploading git-sdk-64-minimal as a workflow artifact
after using it to build Git, and then download it as workflow artifact
in the test phase.

Even better: the `vs-test` job no longer needs to depend on the
`windows-build` job. The only reason it depended on it was to ensure
that the `git-sdk-64-minimal` workflow artifact was available.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 20:35:39 -07:00
6afb265b96 add_ref_decoration(): rename s/type/deco_type/
Now that we have two types (a decoration type and an object type) in the
function, let's give them both unique names to avoid accidentally using
one instead of the other.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 20:32:32 -07:00
88473c8bae load_ref_decorations(): avoid parsing non-tag objects
When we load the ref decorations, we parse the object pointed to by each
ref in order to get a "struct object". This is unnecessarily expensive;
we really only need the object struct, and don't even look at the parsed
contents. The exception is tags, which we do need to peel.

We can improve this by looking up the object type first (which is much
cheaper), and skipping the parse entirely for non-tags. This increases
the work slightly for annotated tags (which now do a type lookup _and_ a
parse), but decreases it a lot for other types. On balance, this seems
to be a good tradeoff.

In my git.git clone, with ~2k refs, most of which are branches, the time
to run "git log -1 --decorate" drops from 34ms to 11ms. Even on my
linux.git clone, which contains mostly tags and only a handful of
branches, the time drops from 30ms to 19ms. And on a more extreme
real-world case with ~220k refs, mostly non-tags, the time drops from
2.6s to 650ms.

That command is a lop-sided example, of course, because it does as
little non-loading work as possible. But it does show the absolute time
improvement. Even in something like a full "git log --decorate" on that
extreme repo, we'd still be saving 2s of CPU time.

Ideally we could push this even further, and avoid parsing even tags, by
relying on the packed-refs "peel" optimization (which we could do by
calling peel_iterated_oid() instead of peeling manually). But we can't
do that here. The packed-refs file only stores the bottom-layer of the
peel (so in a "tag->tag->commit" chain, it stores only the commit as the
peel result).  But the decoration code wants to peel the layers
individually, annotating the middle layers of the chain.

If the packed-refs file ever learns to store all of the peeled layers,
then we could switch to it. Or even if it stored a flag to indicate the
peel was not multi-layer (because most of them aren't), then we could
use it most of the time and fall back to a manual peel for the rare
cases.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 20:31:40 -07:00
7463064b28 object.h: add lookup_object_by_type() function
In some cases it's useful for efficiency reasons to get the type of an
object before deciding whether to parse it, but we still want an object
struct. E.g., in reachable.c, bitmaps give us the type, but we just want
to mark flags on each object. Likewise, we may loop over every object
and only parse tags in order to peel them; checking the type first lets
us avoid parsing the non-tags.

But our lookup_blob(), etc, functions make getting an object struct
annoying: we have to call the right function for every type. And we
cannot just use the generic lookup_object(), because it only returns an
already-seen object; it won't allocate a new object struct.

Let's provide a function that dispatches to the correct lookup_*
function based on a run-time type. In fact, reachable.c already has such
a helper, so we'll just make that public.

I did change the return type from "void *" to "struct object *". While
the former is a clever way to avoid casting inside the function, it's
less safe and less informative to people reading the function
declaration.

The next commit will add a new caller.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 20:30:18 -07:00
542d6abbb4 object.h: expand docstring for lookup_unknown_object()
The lookup_unknown_object() system is not often used and is somewhat
confusing. Let's try to explain it a bit more (which is especially
important as I'm adding a related but slightly different function in the
next commit).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 20:30:17 -07:00
b2086b5183 log: avoid loading decorations for userformats that don't need it
If no --decorate option is given, we default to auto-decoration. And
when that kicks in, cmd_log_init_finish() will unconditionally load the
decoration refs.

However, if we are using a user-format that does not include "%d" or
"%D", we won't show the decorations at all, so we don't need to load
them. We can detect this case and auto-disable them by adding a new
field to our userformat_want helper. We can do this even when the user
explicitly asked for --decorate, because it can't affect the output at
all.

This patch consistently reduces the time to run "git log -1 --format=%H"
on my git.git clone (with ~2k refs) from 34ms to 7ms. On a much more
extreme real-world repository (with ~220k refs), it goes from 2.5s to
4ms.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 20:30:17 -07:00
3c7e2e8f0a pretty.h: update and expand docstring for userformat_find_requirements()
The comment only mentions "notes", but there are more fields now (and
I'm about to add another). Let's make it more general, and stick the
struct next to the function to make the list of possibilities obvious.

While we're touching this comment, let's also mention the behavior of
NULL, which some callers rely on (though in the long run, this global is
pretty nasty and probably should get moved into rev_info).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 20:30:17 -07:00
1cf823d8f0 submodule: remove unnecessary prefix based option logic
Over time when parts of submodule have been ported from shell to
builtin, many instances of the submodule helper have been added.
Also added with them are some unnecessary option passing
logic that are based on the `prefix` shell variable which never
gets set in their code flows.

On analysis, the only shell functions which have a valid usage
for the `prefix` shell variable are:

    - cmd_update: which is the only function which sets the variable
      and thus uses it properly

    - cmd_init: which uses the variable via a call from cmd_update

So, remove the unnecessary option parsing logic based on the `prefix`
shell variable.

Signed-off-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 20:28:19 -07:00
fb20d4b126 pack-objects tests: cover blindspots in stdin handling
Cover blindspots in the testing of stdin handling, including the
"!len" condition added in b5d97e6b0a (pack-objects: run rev-list
equivalent internally., 2006-09-04). The codepath taken with --revs
and read_object_list_from_stdin() acts differently in some of these
common cases, let's test for those.

The "--stdin --revs" test being added here stresses the combination of
--stdin-packs and the revision.c --stdin argument, some of this was
covered in a test added in 339bce27f4 (builtin/pack-objects.c: add
'--stdin-packs' option, 2021-02-22), but let's make sure that
GIT_TEST_DISALLOW_ABBREVIATED_OPTIONS=true keeps erroring out about
--stdin, and it isn't picked up by the revision.c API's handling of
that option.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 20:27:27 -07:00
e355307692 config: normalize the path of the system gitconfig
Git for Windows is compiled with a runtime prefix, and that runtime
prefix is typically `C:/Program Files/Git/mingw64`. As we want the
system gitconfig to live in the sibling directory `etc`, we define the
relative path as `../etc/gitconfig`.

However, as reported by Philip Oakley, the output of `git config
--show-origin --system -l` looks rather ugly, as it shows the path as
`file:C:/Program Files/Git/mingw64/../etc/gitconfig`, i.e. with the
`mingw64/../` part.

By normalizing the path, we get a prettier path.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 20:11:51 -07:00
50101b93ca cmake(windows): set correct path to the system Git config
Currently, when Git for Windows is built with CMake, the system Git config is
expected in a different location than when building via `make`: the former
expects it to be in `<runtime-prefix>/mingw64/etc/gitconfig`, the latter in
`<runtime-prefix>/etc/gitconfig`.

Because of this, things like `git clone` do not work correctly (because cURL is
no longer able to find its certificate bundle that it needs to validate HTTPS
certificates). See the full bug report and discussion here:
https://github.com/git-for-windows/git/issues/3071#issuecomment-789261386.

This commit aligns the CMake-based build by mimicking what is already done in
`config.mak.uname`.

This closes https://github.com/git-for-windows/git/issues/3071.

Signed-off-by: Dennis Ameling <dennis@dennisameling.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 20:11:50 -07:00
fb5e3378f8 mingw: move Git for Windows' system config where users expect it
Git for Windows' prefix is `/mingw64/` (or `/mingw32/` for 32-bit
versions), therefore the system config is located at the clunky location
`C:\Program Files\Git\mingw64\etc\gitconfig`.

This moves the system config into a more logical location: the `mingw64`
part of `C:\Program Files\Git\mingw64\etc\gitconfig` never made sense,
as it is a mere implementation detail. Let's skip the `mingw64` part and
move this to `C:\Program Files\Git\etc\gitconfig`.

Side note: in the rare (and not recommended) case a user chooses to
install 32-bit Git for Windows on a 64-bit system, the path will of
course be `C:\Program Files (x86)\Git\etc\gitconfig`.

Background: During the Git for Windows v1.x days, the system config was
located at `C:\Program Files (x86)\Git\etc\gitconfig`. With Git for
Windows v2.x, it moved to `C:\Program Files\Git\mingw64\gitconfig` (or
`C:\Program Files (x86)\Git\mingw32\gitconfig`). Rather than fixing it
back then, we tried to introduce a "Windows-wide" config, but that never
caught on.

Likewise, we move the system `gitattributes` into the same directory.

Obviously, we are cautious to do this only for the known install
locations `/mingw64` and `/mingw32`; If anybody wants to override that
while building their version of Git (e.g. via `make prefix=$HOME`), we
leave the default location of the system config and gitattributes alone.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 20:11:48 -07:00
ebbf5d2b70 config.mak.uname: PCRE1 cleanup
Style issue: a space was missing.

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 20:10:47 -07:00
9fffc38583 Documentation: fix typo in the --patch option of the commit command
Typofix (chose -> choose) in the documentation of the patch option
under the commit command.

Signed-off-by: Beshr Kayali <me@beshr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 13:08:25 -07:00
9b6e2c8b98 pager: avoid setting COLUMNS when we're guessing its value
We query `TIOCGWINSZ` in Git to determine the correct value for
`COLUMNS`, and then set that environment variable.

If `TIOCGWINSZ` is not available, we fall back to the hard-coded value
80 _and still_ set the environment variable.

On Windows this is a problem. The reason is that Git for
Windows uses a version of `less` that relies on the MSYS2 runtime to
interact with the pseudo terminal (typically inside a MinTTY window,
which is also aware of the MSYS2 runtime). Both MinTTY and `less.exe`
interact with that pseudo terminal via `ioctl()` calls (which the MSYS2
runtime emulates even if there is no such thing on Windows).
Since https://github.com/gwsw/less/commit/bb0ee4e76c2, `less` prefers
the `COLUMNS` variable over asking ncurses itself.

But `git.exe` itself is _not_ aware of the MSYS2 runtime, or for that
matter of that pseudo terminal, and has no way to call `ioctl()` or
`TIOCGWINSZ`.

Therefore, `git.exe` will fall back to hard-coding 80 columns, no matter
what the actual terminal size is.

But `less.exe` is totally able to interact with the MSYS2 runtime and
would not actually require Git's help (which actually makes things
worse here). So let's not override `COLUMNS` on Windows.

Let's just not set `COLUMNS` unless we managed to query the actual value
from the terminal.

This fixes https://github.com/git-for-windows/git/issues/3235

Co-authored-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 10:42:10 -07:00
98c7656a18 git-worktree.txt: fix typo in example path
Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 10:05:15 -07:00
6fc5369263 t: fix typos in test messages
Both in t4258 and in t9001, the code of the tests following shows the
proper name for the configuration variables.  So use the correct names
in the test messages as well.

Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 10:05:14 -07:00
3fca954172 blame: correct name of config option in docs
As can be seen in files "Documentation/blame-options.txt" and
"builtin/blame.c", the name of this configuration option is
"blame.markUnblamableLines".

Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 10:05:13 -07:00
ef830cc434 promisor-remote: teach lazy-fetch in any repo
This is one step towards supporting partial clone submodules.

Even after this patch, we will still lack partial clone submodules
support, primarily because a lot of Git code that accesses submodule
objects does so by adding their object stores as alternates, meaning
that any lazy fetches that would occur in the submodule would be done
based on the config of the superproject, not of the submodule. This also
prevents testing of the functionality in this patch by user-facing
commands. So for now, test this mechanism using a test helper.

Besides that, there is some code that uses the wrapper functions
like has_promisor_remote(). Those will need to be checked to see if they
could support the non-wrapper functions instead (and thus support any
repository, not just the_repository).

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 09:58:01 -07:00
d1fa94356d run-command: refactor subprocess env preparation
submodule.c has functionality that prepares the environment for running
a subprocess in a new repo. The lazy-fetching code (used in partial
clones) will need this in a subsequent commit, so move it to a more
central location.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 09:58:01 -07:00
69bb2e1804 submodule: refrain from filtering GIT_CONFIG_COUNT
14111fc492 ("git: submodule honor -c credential.* from command line",
2016-03-01) taught Git to pass through the GIT_CONFIG_PARAMETERS
environment variable when invoking a subprocess on behalf of a
submodule. But when d8d77153ea ("config: allow specifying config entries
via envvar pairs", 2021-01-15) introduced support for GIT_CONFIG_COUNT
(and its associated GIT_CONFIG_KEY_? and GIT_CONFIG_VALUE_?), the
subprocess mechanism wasn't updated to also pass through these
variables.

Since they are conceptually the same (d8d77153ea was written to address
a shortcoming of GIT_CONFIG_PARAMETERS), update the submodule subprocess
mechanism to also pass through GIT_CONFIG_COUNT.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 09:57:54 -07:00
ef7dc2e9cc promisor-remote: support per-repository config
Instead of using global variables to store promisor remote information,
store this config in struct repository instead, and add
repository-agnostic non-static functions corresponding to the existing
non-static functions that only work on the_repository.

The actual lazy-fetching of missing objects currently does not work on
repositories other than the_repository, and will still not work after
this commit, so add a BUG message explaining this. A subsequent commit
will remove this limitation.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 09:57:42 -07:00
ebaf3bcf1a repository: move global r_f_p_c to repo struct
Move repository_format_partial_clone, which is currently a global
variable, into struct repository. (Full support for per-repository
partial clone config will be done in a subsequent commit - this is split
into its own commit because of the extent of the changes needed.)

The new repo-specific variable cannot be set in
check_repository_format_gently() (as is currently), because that
function does not know which repo it is operating on (or even whether
the value is important); therefore this responsibility is delegated to
the outermost caller that knows. Of all the outermost callers that know
(found by looking at all functions that call clear_repository_format()),
I looked at those that either read from the main Git directory or write
into a struct repository. These callers have been modified accordingly
(write to the_repository in the former case and write to the given
struct repository in the latter case).

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 09:57:41 -07:00
54662d5958 git-p4: fix failed submit by skip non-text data files
If the submit contain binary files, it will throw exception and stop submit when try to append diff line description.

This commit will skip non-text data files when exception UnicodeDecodeError thrown.

The skip will not affect actual submit files in the resulting cl,
the diff line description will only appear in submit template,
so you can review what changed before actully submit to p4.

I don't know if add any message here will be helpful for users,
so I choose to just skip binary content, since it already append filename previously.

Signed-off-by: dorgon.chang <dorgonman@hotmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 09:49:30 -07:00
d65aea37d9 show-branch tests: add missing tests
Add missing tests for --remotes, --list and --merge-base. These are
not exhaustive, but better than the nothing we have now.

There were some tests for this command added in f76412ed6d ([PATCH]
Add 'git show-branch'., 2005-08-21) has never been properly tested,
namely for the --all option in t6432-merge-recursive-space-options.sh,
and some of --merge-base and --independent in t6010-merge-base.sh.

This fixes a few more blind spots, but there's still a lot of behavior
that's not tested for.

These new tests show the odd (and possibly unintentional) behavior of
--merge-base with one argument, and how its output is the same as "git
merge-base" with N bases in this particular case. See the test added
in f621a8454d (git-merge-base/git-show-branch --merge-base:
Documentation and test, 2009-08-05) for a case where the two aren't
the same.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 09:33:06 -07:00
4465690cd8 show-branch: don't <COLOR></RESET> for space characters
Change the colored output introduced in ab07ba2a24 (show-branch: color
the commit status signs, 2009-04-22) to not color and reset each
individual space character we use for padding. The intent is to color
just the "!", "+" etc. characters.

This makes the output easier to test, so let's do that now. The test
would be much more verbose without a color/reset for each space
character. Since the coloring cycles through colors we previously had
a "rainbow of space characters".

In theory this breaks things for anyone who's relying on the exact
colored output of show-branch, in practice I'd think anyone parsing it
isn't actively turning on the colored output.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 09:33:06 -07:00
2f61b3eef3 mktag tests: test fast-export
Pass the bad tags we've created in the mktag tests through
fast-export, it will die on the bad object or ref, let's make sure
that happens.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 09:30:41 -07:00
b48015b340 mktag tests: test for-each-ref
Add a "for-each-ref" for all the mktag tests. This test would have
caught the segfault which was fixed in c685450880 (ref-filter: fix
NULL check for parse object failure, 2021-04-01). Let's make sure we
test that code more exhaustively.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 09:30:41 -07:00
eddc1f556c mktag tests: test update-ref and reachable fsck
Extend the mktag tests to pass the created bad tag through update-ref
and fsck.

The reason for passing it through update-ref is to guard against it
having a segfault as for-each-ref did before c685450880 (ref-filter:
fix NULL check for parse object failure, 2021-04-01).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 09:30:41 -07:00
47c0cb1a5d mktag tests: test hash-object --literally and unreachable fsck
Extend the mktag tests to pass the tag we've created through both
hash-object --literally and fsck.

This checks that fsck itself will not complain about certain invalid
content if a reachable tip isn't involved. Due to how fsck works and
walks the graph the failure will be different if the object is
reachable, so we might succeed before we've created the ref.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 09:30:41 -07:00
2bff554b23 merge-ort: add prefetching for content merges
Commit 7fbbcb21b1 ("diff: batch fetching of missing blobs", 2019-04-05)
introduced batching of fetching missing blobs, so that the diff
machinery would have one fetch subprocess grab N blobs instead of N
processes each grabbing 1.

However, the diff machinery is not the only thing in a merge that needs
to work on blobs.  The 3-way content merges need them as well.  Rather
than download all the blobs 1 at a time, prefetch all the blobs needed
for regular content merges.

This does not cover all possible paths in merge-ort that might need to
download blobs.  Others include:
  - The blob_unchanged() calls to avoid modify/delete conflicts (when
    blob renormalization results in an "unchanged" file)
  - Preliminary content merges needed for rename/add and
    rename/rename(2to1) style conflicts.  (Both of these types of
    conflicts can result in nested conflict markers from the need to do
    two levels of content merging; the first happens before our new
    prefetch_for_content_merges() function.)

The first of these wouldn't be an extreme amount of work to support, and
even the second could be theoretically supported in batching, but all of
these cases seem unusual to me, and this is a minor performance
optimization anyway; in the worst case we only get some of the fetches
batched and have a few additional one-off fetches.  So for now, just
handle the regular 3-way content merges in our prefetching.

For the testcase from the previous commit, the number of downloaded
objects remains at 63, but this drops the number of fetches needed from
32 down to 20, a sizeable reduction.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 07:58:25 -07:00
1aedd03afb diffcore-rename: use a different prefetch for basename comparisons
merge-ort was designed to minimize the amount of data needed and used,
and several changes were made to diffcore-rename to take advantage of
extra metadata to enable this data minimization (particularly the
relevant_sources variable for skipping "irrelevant" renames).  This
effort obviously succeeded in drastically reducing computation times,
but should also theoretically allow partial clones to download much less
information.  Previously, though, the "prefetch" command used in
diffcore-rename had never been modified and downloaded many blobs that
were unnecessary for merge-ort.  This commit corrects that.

When doing basename comparisons, we want to fetch only the objects that
will be used for basename comparisons.  If after basename fetching this
leaves us with no more relevant sources (or no more destinations), then
we won't need to do the full inexact rename detection and can skip
downloading additional source and destination files.  Even if we have to
do that later full inexact rename detection, irrelevant sources are
culled after basename matching and before the full inexact rename
detection, so we can still avoid downloading the blobs for irrelevant
sources.  Rename prefetch() to inexact_prefetch(), and introduce a
new basename_prefetch() to take advantage of this.

If we modify the testcase from commit 557ac0350d ("merge-ort: begin
performance work; instrument with trace2_region_* calls", 2021-01-23)
to pass
    --sparse --filter=blob:none
to the clone command, and use the new trace2 "fetch_count" output from
a few commits ago to track both the number of fetch subcommands invoked
and the number of objects fetched across all those fetches, then for
the mega-renames testcase we observe the following:

BEFORE this commit, rebasing 35 patches:
    strategy     # of fetches    total # of objects fetched
    ---------    ------------    --------------------------
    recursive    62              11423
    ort          30              11391

AFTER this commit, rebasing the same 35 patches:
    ort          32                 63

This means that the new code only needs to download less than 2 blobs
per patch being rebased.  That is especially interesting given that the
repository at the start only had approximately half a dozen TOTAL blobs
downloaded to start with (because the default sparse-checkout of just
the toplevel directory was in use).

So, for this particular linux kernel testcase that involved ~26,000
renames on the upstream side (drivers/ -> pilots/) across which 35
patches were being rebased, this change reduces the number of blobs that
need to be downloaded by a factor of ~180.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 07:58:25 -07:00
d331dd3b0c diffcore-rename: allow different missing_object_cb functions
estimate_similarity() was setting up a diff_populate_filespec_options
every time it was called, requiring the caller of estimate_similarity()
to pass in some data needed to set up this option.  Currently the needed
data consisted of a single variable (skip_unmodified), but we want to
also have the different estimate_similarity() callsites start using
different missing_object_cb functions as well.  Rather than also passing
that data in, just have the caller pass in the whole
diff_populate_filespec_options, and reduce the number of times we need to
set it up.

As a side note, this also drops the number of calls to
has_promisor_remote() dramatically.  If L is the number of basename
paths to compare, M is the number of inexact sources, and N is the
number of inexact destinations, then the number of calls to
has_promisor_remote() drops from L+M*N down to at most 2 -- one for each
of the sites that calls estimate_similarity().  has_promisor_remote() is
a very fast function so this almost certainly has no measurable
performance impact, but it seems cleaner to avoid calling that function
so many times.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 07:58:25 -07:00
c75c423952 t6421: add tests checking for excessive object downloads during merge
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 07:58:25 -07:00
9146ef75dd l10n: fixed tripple-letter typos
Andrei found that the word "shallow" has an extra letter "l" in
"po/zh_CN.po". There are similar typos in other l10n files.

Reported-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
2021-06-21 09:46:18 +08:00
a7d18a1109 pull: trivial whitespace style fix
Two spaces unaligned to anything is not part of the coding-style. A
single tab is.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-19 16:36:17 +09:00
a751e0296f pull: trivial cleanup
There's no need to store ran_ff. Now it's obvious from the conditionals.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-19 16:36:17 +09:00
340062243a pull: cleanup autostash check
Currently "git pull --rebase" takes a shortcut in the case a
fast-forward merge is possible; run_merge() is called with --ff-only.

However, "git merge" didn't have an --autostash option, so, when "git
pull --rebase --autostash" was called *and* the fast-forward merge
shortcut was taken, then the pull failed.

This was fixed in commit f15e7cf5cc (pull: ff --rebase --autostash
works in dirty repo, 2017-06-01) by simply skipping the fast-forward
merge shortcut.

Later on "git merge" learned the --autostash option [a03b55530a
(merge: teach --autostash option, 2020-04-07)], and so did "git pull"
[d9f15d37f1 (pull: pass --autostash to merge, 2020-04-07)].

Therefore it's not necessary to skip the fast-forward merge shortcut
anymore when called with --rebase --autostash.

Let's always take the fast-forward merge shortcut by essentially
reverting f15e7cf5cc.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-19 16:36:16 +09:00
cea232194d completion: bash: fix late declaration of __git_cmd_idx
A recent update to contrib/completion/git-completion.bash causes bash to fail
auto complete custom commands that are wrapped with __git_func_wrap. Declaring
__git_cmd_idx=0 inside __git_func_wrap resolves the issue.

Signed-off-by: Fabian Wermelinger <fabianw@mavt.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-19 15:30:35 +09:00
7c0afdf23c t: use portable wrapper for readlink(1)
Not all systems have a readlink program available for use by the shell.
This causes t3210 to fail on at least AIX. Let's provide a perl
one-liner to do the same thing, and use it there.

I also updated calls in t9802. Nobody reported failure there, but it's
the same issue. Presumably nobody actually tests with p4 on AIX in the
first place (if it is even available there).

I left the use of readlink in the "--valgrind" setup in test-lib.sh, as
valgrind isn't available on exotic platforms anyway (and I didn't want
to increase dependencies between test-lib.sh and test-lib-functions.sh).

There's one other curious case. Commit d2addc3b96 (t7800: readlink may
not be available, 2016-05-31) fixed a similar case. We can't use our
wrapper function there, though, as it's inside a sub-script triggered by
Git. It uses a slightly different technique ("ls" piped to "sed"). I
chose not to use that here as it gives confusing "ls -l" output if the
file is unexpectedly not a symlink (which is OK for its limited use, but
potentially confusing for general use within the test suite). The perl
version emits the empty string.

Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-19 15:26:05 +09:00
12d6991cf4 test: refactor to use "get_abbrev_oid" to get abbrev oid
Add new function "get_abbrev_oid" to get abbrev object ID.  This
function has a default value which helps to prepare a nonempty replace
pattern for sed command.  An empty replace pattern may cause sed fail
to allocate memory.

Refactor function "make_user_friendly_and_stable_output" to use
"get_abbrev_oid" to get abbrev object ID.

Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-17 14:12:24 +09:00
3c06a58339 test: refactor to use "test_commit" to create commits
Refactor function "create_commits_in" to use "test_commit" to create
commit.

Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-17 14:12:22 +09:00
2bafb3d702 test: compare raw output, not mangle tabs and spaces
Before comparing with the expect file, we used to call function
"make_user_friendly_and_stable_output" to filter out trailing spaces in
output.  Ævar recommends using pattern "s/Z$//" to prepare expect file,
and then compare it with raw output.

Since we have fixed the issue of occasionally missing the clear-to-eol
suffix when displaying sideband #2 messages, it is safe and stable to
test against raw output.

Suggested-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-17 14:12:21 +09:00
5210225f25 sideband: don't lose clear-to-eol at packet boundary
When "demultiplex_sideband()" sees a nonempty message ending with CR or
LF on the sideband #2, it adds "suffix" string to clear to the end of
the current line, which helps when relaying a progress display whose
records are terminated with CRs.  But if it sees a single LF, no
clear-to-end suffix should be appended, because this single LF is used
to end the progress display by moving to the next line, and the final
progress display above should be preserved.

However, the code forgot that depending on the length of the payload
line, such a CR may fall exactly at the packet boundary and the
number of bytes before the CR from the beginning of the packet could
be zero.  In such a case, the message that was terminated by the CR
were leftover in the "scratch" buffer in the previous call to the
function and we still need to clear to the end of the current line.

Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-17 14:11:36 +09:00
eb87c6f559 t6020: fix incompatible parameter expansion
Ævar reported that the function `make_user_friendly_and_stable_output()`
failed on a i386 box (gcc45) in the gcc farm boxes with error:

    sed: couldn't re-allocate memory

It turns out that older versions of bash (4.3) or dash (0.5.7) cannot
evaluate expression like `${A%${A#???????}}` used to get the leading 7
characters of variable A.

Replace the incompatible parameter expansion so that t6020 works on
older version of bash or dash.

Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-17 14:09:43 +09:00
c4e317814f userdiff: add support for C# record types
Records are added in C# 9

Code example :

    public record Person(string FirstName, string LastName);

For more information, see:
* https://docs.microsoft.com/en-us/dotnet/csharp/whats-new/csharp-9

Signed-off-by: Julian Verdurmen <julian.verdurmen@outlook.com>
Reviewed-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-16 17:06:20 +09:00
78cfdd0cf5 promisor-remote: output trace2 statistics for number of objects fetched
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-16 16:17:13 +09:00
46a237f42f *: fix typos
These typos were found while searching the codebase for gendered
pronouns. In the case of t9300-fast-import.sh, remove a confusing
comment that is unnecessary to the understanding of the test.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-16 11:26:20 +09:00
0e20b229ee comments: avoid using the gender of our users
We generally avoid specifying the gender of our users in order to be
more inclusive, but sometimes a few slip by due to habit.

Since by doing a little bit of rewording we can avoid this irrelevant
detail, let's do so.

Inspired-by: Derrick Stolee <dstolee@microsoft.com>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-16 11:25:11 +09:00
69b3367f6c doc: avoid using the gender of other people
Using gendered pronouns for an anonymous person applies a gender where
none is known and further excludes readers who do not use gendered
pronouns. Avoid such examples in the documentation by using "they" or
passive voice to avoid the need for a pronoun.

Inspired-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-16 11:25:09 +09:00
9eb542f2ee gc tests: add a test for the "pre-auto-gc" hook
Add a missing test for the behavior of the pre-auto-gc hook added in
0b85d92661 (Documentation/hooks: add pre-auto-gc hook, 2008-04-02).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-16 10:25:12 +09:00
aac578492d pre-commit hook tests: don't leave "actual" nonexisting on failure
Start by creating an "actual" file in a core.hooksPath test that has
the hook echoing to the "actual" file.

We later test_cmp that file to see what hooks were run. If we fail to
run our hook(s) we'll have an empty list of hooks for the test_cmp
instead of a nonexisting file. For the logic of this test that makes more sense.

See 867ad08a26 (hooks: allow customizing where the hook directory is,
2016-05-04) for the commit that added these tests.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-16 10:24:39 +09:00
9853830787 graph: improve grammar of "invalid color" error message
Without the "d", it sounds like a command, not an error, and is liable
to be translated incorrectly.

Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-15 12:54:26 +09:00
9b6e74a9c0 show-branch tests: modernize test code
Modernize test code added in ce567d1867 (Add test to show that
show-branch misses out the 8th column, 2008-07-23) and
11ee57bc4c (sort_in_topological_order(): avoid setting a commit flag,
2008-07-23) to use test helpers.

I'm renaming "out" to "actual" for consistency with other tests, and
introducing a "branches.sorted" file in the setup, to make it clear
that it's important that the list be sorted in this particular way.

The "show-branch" output is indented with spaces, which would cause
complaints under "git show --check" with an indented here-doc
block. Let's prefix the lines with "> " to work around that, and to
make it clear that the leading whitespace is important.

We can also get rid of the hardcoding of "main" added here in
334afbc76f (tests: mark tests relying on the current default for
`init.defaultBranch`, 2020-11-18). For this test we're setting up an
"initial" commit anyway, and now that we've moved over to test_commit
we can reference that instead.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-15 12:12:01 +09:00
4f5ce122ac show-branch tests: rename the one "show-branch" test file
Rename the only *show-branch* test file to indicate that more tests
belong it in than just the one-off octopus test it now contains.

The test was initially added in ce567d1867 (Add test to show that
show-branch misses out the 8th column, 2008-07-23) and
11ee57bc4c (sort_in_topological_order(): avoid setting a commit flag,
2008-07-23). Those two add almost the same content, one with a
test_expect_success and the other a test_expect_failure (a bug being
tested for was fixed on one of the branches).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-15 12:12:01 +09:00
4dbc55e87d builtin/checkout--worker: zero-initialise struct to avoid MSAN complaints
report_result() sends a struct to the parent process, but that struct
would contain uninitialised padding bytes. Running this code under MSAN
rightly triggers a warning - but we don't particularly care about this
warning because we control the receiving code, and we therefore know
that those padding bytes won't be read on the receiving end.

We could simply suppress this warning under MSAN with the approporiate
ifdef'd attributes, but a less intrusive solution is to 0-initialise the
struct, which guarantees that the padding will also be initialised.

Interestingly, in the error-case branch, we only try to copy the first
two members of pc_item_result, by copying only PC_ITEM_RESULT_BASE_SIZE
bytes. However PC_ITEM_RESULT_BASE_SIZE is defined as
'offsetof(the_last_member)', which means that we're copying padding bytes
after the end of the second last member. We could avoid doing this by
redefining PC_ITEM_RESULT_BASE_SIZE as
'offsetof(second_last_member) + sizeof(second_last_member)', but there's
no huge benefit to doing so (and this patch silences the MSAN warning in
this scenario either way).

MSAN output from t2080 (partially interleaved due to the
parallel work :) ):

Uninitialized bytes in __interceptor_write at offset 12 inside [0x7fff37d83408, 160)
==23279==WARNING: MemorySanitizer: use-of-uninitialized-value
Uninitialized bytes in __interceptor_write at offset 12 inside [0x7ffdb8a07ec8, 160)
==23280==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0xd5ac28 in xwrite /home/ahunt/git/git/wrapper.c:256:8
    #1 0xd5b327 in write_in_full /home/ahunt/git/git/wrapper.c:311:21
    #2 0xb0a8c4 in do_packet_write /home/ahunt/git/git/pkt-line.c:221:6
    #3 0xb0a5fd in packet_write /home/ahunt/git/git/pkt-line.c:242:6
    #4 0x4f7441 in report_result /home/ahunt/git/git/builtin/checkout--worker.c:69:2
    #5 0x4f6be6 in worker_loop /home/ahunt/git/git/builtin/checkout--worker.c💯3
    #6 0x4f68d3 in cmd_checkout__worker /home/ahunt/git/git/builtin/checkout--worker.c:143:2
    #7 0x4a1e76 in run_builtin /home/ahunt/git/git/git.c:461:11
    #8 0x49e1e7 in handle_builtin /home/ahunt/git/git/git.c:714:3
    #9 0x4a0c08 in run_argv /home/ahunt/git/git/git.c:781:4
    #10 0x49d5a8 in cmd_main /home/ahunt/git/git/git.c:912:19
    #11 0x7974da in main /home/ahunt/git/git/common-main.c:52:11
    #12 0x7f8778114349 in __libc_start_main (/lib64/libc.so.6+0x24349)
    #13 0x421bd9 in _start /home/abuild/rpmbuild/BUILD/glibc-2.26/csu/../sysdeps/x86_64/start.S:120

  Uninitialized value was created by an allocation of 'res' in the stack frame of function 'report_result'
    #0 0x4f72c0 in report_result /home/ahunt/git/git/builtin/checkout--worker.c:55

SUMMARY: MemorySanitizer: use-of-uninitialized-value /home/ahunt/git/git/wrapper.c:256:8 in xwrite
Exiting
    #0 0xd5ac28 in xwrite /home/ahunt/git/git/wrapper.c:256:8
    #1 0xd5b327 in write_in_full /home/ahunt/git/git/wrapper.c:311:21
    #2 0xb0a8c4 in do_packet_write /home/ahunt/git/git/pkt-line.c:221:6
    #3 0xb0a5fd in packet_write /home/ahunt/git/git/pkt-line.c:242:6
    #4 0x4f7441 in report_result /home/ahunt/git/git/builtin/checkout--worker.c:69:2
    #5 0x4f6be6 in worker_loop /home/ahunt/git/git/builtin/checkout--worker.c💯3
    #6 0x4f68d3 in cmd_checkout__worker /home/ahunt/git/git/builtin/checkout--worker.c:143:2
    #7 0x4a1e76 in run_builtin /home/ahunt/git/git/git.c:461:11
    #8 0x49e1e7 in handle_builtin /home/ahunt/git/git/git.c:714:3
    #9 0x4a0c08 in run_argv /home/ahunt/git/git/git.c:781:4
    #10 0x49d5a8 in cmd_main /home/ahunt/git/git/git.c:912:19
    #11 0x7974da in main /home/ahunt/git/git/common-main.c:52:11
    #12 0x7f2749a0e349 in __libc_start_main (/lib64/libc.so.6+0x24349)
    #13 0x421bd9 in _start /home/abuild/rpmbuild/BUILD/glibc-2.26/csu/../sysdeps/x86_64/start.S:120

  Uninitialized value was created by an allocation of 'res' in the stack frame of function 'report_result'
    #0 0x4f72c0 in report_result /home/ahunt/git/git/builtin/checkout--worker.c:55

SUMMARY: MemorySanitizer: use-of-uninitialized-value /home/ahunt/git/git/wrapper.c:256:8 in xwrite

Signed-off-by: Andrzej Hunt <andrzej@ahunt.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-15 12:07:56 +09:00
09751bf1b2 split-index: use oideq instead of memcmp to compare object_id's
cache_entry contains an object_id, and compare_ce_content() would
include that field when calling memcmp on a subset of the cache_entry.
Depending on which hashing algorithm is being used, only part of
object_id.hash is actually being used, therefore including it in a
memcmp() is incorrect. Instead we choose to exclude the object_id when
calling memcmp(), and call oideq() separately.

This issue was found when running t1700-split-index with MSAN, see MSAN
output below (on my machine, offset 76 corresponds to 4 bytes after the
start of object_id.hash).

Uninitialized bytes in MemcmpInterceptorCommon at offset 76 inside [0x7f60e7c00118, 92)
==27914==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x4524ee in memcmp /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/msan/../sanitizer_common/sanitizer_common_interceptors.inc:873:10
    #1 0xc867ae in compare_ce_content /home/ahunt/git/git/split-index.c:208:8
    #2 0xc859fb in prepare_to_write_split_index /home/ahunt/git/git/split-index.c:336:9
    #3 0xb4bbca in write_split_index /home/ahunt/git/git/read-cache.c:3107:2
    #4 0xb42b4d in write_locked_index /home/ahunt/git/git/read-cache.c:3295:8
    #5 0x638058 in try_merge_strategy /home/ahunt/git/git/builtin/merge.c:758:7
    #6 0x63057f in cmd_merge /home/ahunt/git/git/builtin/merge.c:1663:9
    #7 0x4a1e76 in run_builtin /home/ahunt/git/git/git.c:461:11
    #8 0x49e1e7 in handle_builtin /home/ahunt/git/git/git.c:714:3
    #9 0x4a0c08 in run_argv /home/ahunt/git/git/git.c:781:4
    #10 0x49d5a8 in cmd_main /home/ahunt/git/git/git.c:912:19
    #11 0x7974da in main /home/ahunt/git/git/common-main.c:52:11
    #12 0x7f60e928e349 in __libc_start_main (/lib64/libc.so.6+0x24349)
    #13 0x421bd9 in _start /home/abuild/rpmbuild/BUILD/glibc-2.26/csu/../sysdeps/x86_64/start.S:120

  Uninitialized value was stored to memory at
    #0 0x447eb9 in __msan_memcpy /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/msan/msan_interceptors.cpp:1558:3
    #1 0xb4d1e6 in dup_cache_entry /home/ahunt/git/git/read-cache.c:3457:2
    #2 0xd214fa in add_entry /home/ahunt/git/git/unpack-trees.c:215:18
    #3 0xd1fae0 in keep_entry /home/ahunt/git/git/unpack-trees.c:2276:2
    #4 0xd1ff9e in twoway_merge /home/ahunt/git/git/unpack-trees.c:2504:11
    #5 0xd27028 in call_unpack_fn /home/ahunt/git/git/unpack-trees.c:593:12
    #6 0xd2443d in unpack_nondirectories /home/ahunt/git/git/unpack-trees.c:1106:12
    #7 0xd19435 in unpack_callback /home/ahunt/git/git/unpack-trees.c:1306:6
    #8 0xd0d7ff in traverse_trees /home/ahunt/git/git/tree-walk.c:532:17
    #9 0xd1773a in unpack_trees /home/ahunt/git/git/unpack-trees.c:1683:9
    #10 0xdc6370 in checkout /home/ahunt/git/git/merge-ort.c:3590:8
    #11 0xdc51c3 in merge_switch_to_result /home/ahunt/git/git/merge-ort.c:3728:7
    #12 0xa195a9 in merge_ort_recursive /home/ahunt/git/git/merge-ort-wrappers.c:58:2
    #13 0x637fff in try_merge_strategy /home/ahunt/git/git/builtin/merge.c:751:12
    #14 0x63057f in cmd_merge /home/ahunt/git/git/builtin/merge.c:1663:9
    #15 0x4a1e76 in run_builtin /home/ahunt/git/git/git.c:461:11
    #16 0x49e1e7 in handle_builtin /home/ahunt/git/git/git.c:714:3
    #17 0x4a0c08 in run_argv /home/ahunt/git/git/git.c:781:4
    #18 0x49d5a8 in cmd_main /home/ahunt/git/git/git.c:912:19
    #19 0x7974da in main /home/ahunt/git/git/common-main.c:52:11

  Uninitialized value was created by a heap allocation
    #0 0x44e73d in malloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/msan/msan_interceptors.cpp:901:3
    #1 0xd592f6 in do_xmalloc /home/ahunt/git/git/wrapper.c:41:8
    #2 0xd59248 in xmalloc /home/ahunt/git/git/wrapper.c:62:9
    #3 0xa17088 in mem_pool_alloc_block /home/ahunt/git/git/mem-pool.c:22:6
    #4 0xa16f78 in mem_pool_init /home/ahunt/git/git/mem-pool.c:44:3
    #5 0xb481b8 in load_all_cache_entries /home/ahunt/git/git/read-cache.c
    #6 0xb44d40 in do_read_index /home/ahunt/git/git/read-cache.c:2298:17
    #7 0xb48a1b in read_index_from /home/ahunt/git/git/read-cache.c:2389:8
    #8 0xbd5a0b in repo_read_index /home/ahunt/git/git/repository.c:276:8
    #9 0xb4bcaf in repo_read_index_unmerged /home/ahunt/git/git/read-cache.c:3326:2
    #10 0x62ed26 in cmd_merge /home/ahunt/git/git/builtin/merge.c:1362:6
    #11 0x4a1e76 in run_builtin /home/ahunt/git/git/git.c:461:11
    #12 0x49e1e7 in handle_builtin /home/ahunt/git/git/git.c:714:3
    #13 0x4a0c08 in run_argv /home/ahunt/git/git/git.c:781:4
    #14 0x49d5a8 in cmd_main /home/ahunt/git/git/git.c:912:19
    #15 0x7974da in main /home/ahunt/git/git/common-main.c:52:11
    #16 0x7f60e928e349 in __libc_start_main (/lib64/libc.so.6+0x24349)

SUMMARY: MemorySanitizer: use-of-uninitialized-value /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/msan/../sanitizer_common/sanitizer_common_interceptors.inc:873:10 in memcmp
Exiting

Signed-off-by: Andrzej Hunt <andrzej@ahunt.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-15 12:07:56 +09:00
6a748c2c66 mktag tests: invert --no-strict test
Change the mktag --no-strict test to actually test success under
--no-strict, that test was added in 06ce79152b (mktag: add a
--[no-]strict option, 2021-01-06).

It doesn't make sense to check that we have the same failure except
when we want --no-strict, by doing that we're assuming that the
behavior will be different under --no-strict, bun nothing was testing
for that.

We should instead assert that --strict is the same as --no-strict,
except in the cases where we've declared that it's not.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-15 12:06:48 +09:00
fce3b089df mktag tests: parse out options in helper
Change check_verify_failure() helper to parse out options from
$@. This makes it easier to add new options in the future. See
06ce79152b (mktag: add a --[no-]strict option, 2021-01-06) for the
initial implementation.

Let's also replace "" quotes with '' for the test body, the varables
we need are eval'd into the body, so there's no need for the quoting
confusion.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-15 12:06:47 +09:00
77f37de39f subtree: fix assumption about the directory separator
On Windows, both forward and backslash are valid separators. In
22d5507493 (subtree: don't fuss with PATH, 2021-04-27), however, we
added code that assumes that it can only be the forward slash.

Let's fix that.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-15 11:38:28 +09:00
f7ee88f1d0 subtree: fix the GIT_EXEC_PATH sanity check to work on Windows
In 22d5507493 (subtree: don't fuss with PATH, 2021-04-27), `git
subtree` was broken thoroughly on Windows.

The reason is that it assumes Unix semantics, where `PATH` is
colon-separated, and it assumes that `$GIT_EXEC_PATH:` is a verbatim
prefix of `$PATH`. Neither are true, the latter in particular because
`GIT_EXEC_PATH` is a Windows-style path, while `PATH` is a Unix-style
path list.

Let's make extra certain that `$GIT_EXEC_PATH` and the first component
of `$PATH` refer to different entities before erroring out.

We do that by using the `test <path1> -ef <path2>` command that verifies
that the inode of `<path1>` and of `<path2>` is the same.

Sadly, this construct is non-portable, according to
https://pubs.opengroup.org/onlinepubs/009695399/utilities/test.html.
However, it does not matter in practice because we still first look
whether `$GIT_EXEC_PREFIX` is string-identical to the first component of
`$PATH`. This will give us the expected result everywhere but in Git for
Windows, and Git for Windows' own Bash _does_ handle the `-ef` operator.

Just in case that we _do_ need to show the error message _and_ are
running in a shell that lacks support for `-ef`, we simply suppress the
error output for that part.

This fixes https://github.com/git-for-windows/git/issues/3260

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-15 11:38:26 +09:00
aa9ad6fee5 bitmaps: don't recurse into trees already in the bitmap
If an object is already mentioned in a reachability bitmap we are
building, then by definition so are all of the objects it can reach. We
have an optimization to stop traversing commits when we see they are
already in the bitmap, but we don't do the same for trees.

It's generally unavoidable to recurse into trees for commits not yet
covered by bitmaps (since most commits generally do have unique
top-level trees). But they usually have subtrees that are shared with
other commits (i.e., all of the subtrees the commit _didn't_ touch). And
some of those commits (and their trees) may be covered by the bitmap.

Usually this isn't _too_ big a deal, because we'll visit those subtrees
only once in total for the whole walk. But if you have a large number of
unbitmapped commits, and if your tree is big, then you may end up
opening a lot of sub-trees for no good reason.

We can use the same optimization we do for commits here: when we are
about to open a tree, see if it's in the bitmap (either the one we are
building, or the "seen" bitmap which covers the UNINTERESTING side of
the bitmap when doing a set-difference).

This works especially well because we'll visit all commits before
hitting any trees. So even in a history like:

  A -- B

if "A" has a bitmap on disk but "B" doesn't, we'll already have OR-ed in
the results from A before looking at B's tree (so we really will only
look at trees touched by B).

For most repositories, the timings produced by p5310 are unspectacular.
Here's linux.git:

  Test                         HEAD^             HEAD
  --------------------------------------------------------------------
  5310.4: simulated clone      6.00(5.90+0.10)   5.98(5.90+0.08) -0.3%
  5310.5: simulated fetch      2.98(5.45+0.18)   2.85(5.31+0.18) -4.4%
  5310.7: rev-list (commits)   0.32(0.29+0.03)   0.33(0.30+0.03) +3.1%
  5310.8: rev-list (objects)   1.48(1.44+0.03)   1.49(1.44+0.05) +0.7%

Any improvement there is within the noise (the +3.1% on test 7 has to be
noise, since we are not recursing into trees, and thus the new code
isn't even run). The results for git.git are likewise uninteresting.

But here are numbers from some other real-world repositories (that are
not public). This one's tree is comparable in size to linux.git, but has
~16k refs (and so less complete bitmap coverage):

  Test                         HEAD^               HEAD
  -------------------------------------------------------------------------
  5310.4: simulated clone      38.34(39.86+0.74)   33.95(35.53+0.76) -11.5%
  5310.5: simulated fetch      2.29(6.31+0.35)     2.20(5.97+0.41) -3.9%
  5310.7: rev-list (commits)   0.99(0.86+0.13)     0.96(0.85+0.11) -3.0%
  5310.8: rev-list (objects)   11.32(11.04+0.27)   6.59(6.37+0.21) -41.8%

And here's another with a very large tree (~340k entries), and a fairly
large number of refs (~10k):

  Test                         HEAD^               HEAD
  -------------------------------------------------------------------------
  5310.3: simulated clone      53.83(54.71+1.54)   39.77(40.76+1.50) -26.1%
  5310.4: simulated fetch      19.91(20.11+0.56)   19.79(19.98+0.67) -0.6%
  5310.6: rev-list (commits)   0.54(0.44+0.11)     0.51(0.43+0.07) -5.6%
  5310.7: rev-list (objects)   24.32(23.59+0.73)   9.85(9.49+0.36) -59.5%

This patch provides substantial improvements in these larger cases, and
have any drawbacks for smaller ones (the cost of the bitmap check is
quite small compared to an actual tree traversal).

Note that we have to add a version of revision.c's include_check
callback which handles non-commits. We could possibly consolidate this
into a single callback for all objects types, as there's only one user
of the feature which would need converted (pack-bitmap.c:should_include).
That would in theory let us avoid duplicating any logic. But when I
tried it, the code ended up much worse to read, with lots of repeated
"if it's a commit do this, otherwise do that". Having two separate
callbacks splits that naturally, and matches the existing split of
show_commit/show_object callbacks.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-15 11:13:11 +09:00
670b81a890 The second batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-14 13:34:28 +09:00
98f3f03bcb Merge branch 'fc/doc-build-cleanup'
Preparatory build procedure clean-up for documentation.

* fc/doc-build-cleanup:
  doc: avoid using rm directly
  doc: simplify Makefile using .DELETE_ON_ERROR
  doc: remove unnecessary rm instances
  doc: improve asciidoc dependencies
  doc: refactor common asciidoc dependencies
2021-06-14 13:33:29 +09:00
2019256717 Merge branch 'ab/test-lib-updates'
Test clean-up.

* ab/test-lib-updates:
  test-lib: split up and deprecate test_create_repo()
  test-lib: do not show advice about init.defaultBranch under --verbose
  test-lib: reformat argument list in test_create_repo()
  submodule tests: use symbolic-ref --short to discover branch name
  test-lib functions: add --printf option to test_commit
  describe tests: convert setup to use test_commit
  test-lib functions: add an --annotated option to "test_commit"
  test-lib-functions: document test_commit --no-tag
  test-lib-functions: reword "test_commit --append" docs
  test-lib tests: remove dead GIT_TEST_FRAMEWORK_SELFTEST variable
  test-lib: bring $remove_trash out of retirement
2021-06-14 13:33:29 +09:00
c189dba20e Merge branch 'dd/honor-users-tar-in-tests'
Test portability fix.

* dd/honor-users-tar-in-tests:
  t: use configured TAR instead of tar
2021-06-14 13:33:28 +09:00
d9d3b76fee Merge branch 'ps/rev-list-object-type-filter'
Message update.

* ps/rev-list-object-type-filter:
  help: fix small typo in error message
2021-06-14 13:33:28 +09:00
ac2158649d Merge branch 'ab/trace2-squelch-gcc-warning'
Workaround compiler warnings.

* ab/trace2-squelch-gcc-warning:
  trace2: refactor to avoid gcc warning under -O3
2021-06-14 13:33:28 +09:00
8e444e66df Merge branch 'so/log-m-implies-p'
The "-m" option in "git log -m" that does not specify which format,
if any, of diff is desired did not have any visible effect; it now
implies some form of diff (by default "--patch") is produced.

* so/log-m-implies-p:
  diff-merges: let "-m" imply "-p"
  diff-merges: rename "combined_imply_patch" to "merges_imply_patch"
  stash list: stop passing "-m" to "git log"
  git-svn: stop passing "-m" to "git rev-list"
  diff-merges: move specific diff-index "-m" handling to diff-index
  t4013: test "git diff-index -m"
  t4013: test "git diff-tree -m"
  t4013: test "git log -m --stat"
  t4013: test "git log -m --raw"
  t4013: test that "-m" alone has no effect in "git log"
2021-06-14 13:33:27 +09:00
169914ede2 Merge branch 'en/ort-perf-batch-11'
Optimize out repeated rename detection in a sequence of mergy
operations.

* en/ort-perf-batch-11:
  merge-ort, diffcore-rename: employ cached renames when possible
  merge-ort: handle interactions of caching and rename/rename(1to1) cases
  merge-ort: add helper functions for using cached renames
  merge-ort: preserve cached renames for the appropriate side
  merge-ort: avoid accidental API mis-use
  merge-ort: add code to check for whether cached renames can be reused
  merge-ort: populate caches of rename detection results
  merge-ort: add data structures for in-memory caching of rename detection
  t6429: testcases for remembering renames
  fast-rebase: write conflict state to working tree, index, and HEAD
  fast-rebase: change assert() to BUG()
  Documentation/technical: describe remembering renames optimization
  t6423: rename file within directory that other side renamed
2021-06-14 13:33:27 +09:00
4dd75a195b Merge branch 'jk/fetch-pack-v2-half-close-early'
"git fetch" over protocol v2 left its side of the socket open after
it finished speaking, which unnecessarily wasted the resource on
the other side.

* jk/fetch-pack-v2-half-close-early:
  fetch-pack: signal v2 server that we are done making requests
2021-06-14 13:33:26 +09:00
0dd2fd18f8 Merge branch 'ds/write-index-with-hashfile-api'
Use the hashfile API in the codepath that writes the index file to
reduce code duplication.

* ds/write-index-with-hashfile-api:
  read-cache: delete unused hashing methods
  read-cache: use hashfile instead of git_hash_ctx
  csum-file.h: increase hashfile buffer size
  hashfile: use write_in_full()
2021-06-14 13:33:26 +09:00
f4f7304b44 Merge branch 'jk/clone-clean-upon-transport-error'
Recent "git clone" left a temporary directory behind when the
transport layer returned an failure.

* jk/clone-clean-upon-transport-error:
  clone: clean up directory after transport_fetch_refs() failure
2021-06-14 13:33:26 +09:00
135997254a Merge branch 'ga/send-email-sendmail-cmd'
"git send-email" learned the "--sendmail-cmd" command line option
and the "sendemail.sendmailCmd" configuration variable, which is a
more sensible approach than the current way of repurposing the
"smtp-server" that is meant to name the server to instead name the
command to talk to the server.

* ga/send-email-sendmail-cmd:
  git-send-email: add option to specify sendmail command
2021-06-14 13:33:26 +09:00
289af16300 Merge branch 'zh/ref-filter-atom-type'
The code to handle the "--format" option in "for-each-ref" and
friends made too many string comparisons on %(atom)s used in the
format string, which has been corrected by converting them into
enum when the format string is parsed.

* zh/ref-filter-atom-type:
  ref-filter: introduce enum atom_type
  ref-filter: add objectsize to used_atom
2021-06-14 13:33:25 +09:00
abcb66c614 *: fix typos which duplicate a word
Fix typos in documentation, code comments, and RelNotes which repeat
various words.  In trivial cases, just delete the duplicated word and
rewrap text, if needed.  Reword the affected sentence in
Documentation/RelNotes/1.8.4.txt for it to make sense.

Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-14 10:16:06 +09:00
ce24797d38 cmake: add warning for ignored MSGFMT_EXE
It does not make sense to attempt to set MSGFMT_EXE when NO_GETTEXT is
configured, as such add a check for NO_GETTEXT before attempting to set
it.

Suggested-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Matthew Rogers <mattr94@gmail.com>
Acked-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-11 15:23:24 +09:00
409047a2b3 cmake: create compile_commands.json by default
Some users have expressed interest in a more "batteries included" way of
building via CMake[1], and a big part of that is providing easier access
to tooling external tools.

A straightforward way to accomplish this is to make it as simple as
possible is to enable the generation of the compile_commands.json file,
which is supported by many tools such as: clang-tidy, clang-format,
sourcetrail, etc.

This does come with a small run-time overhead during the configuration
step (~6 seconds on my machine):

    Time to configure with CMAKE_EXPORT_COMPILE_COMMANDS=TRUE

    real    1m9.840s
    user    0m0.031s
    sys     0m0.031s

    Time to configure with CMAKE_EXPORT_COMPILE_COMMANDS=FALSE

    real    1m3.195s
    user    0m0.015s
    sys     0m0.015s

This seems like a small enough price to pay to make the project more
accessible to newer users.  Additionally there are other large projects
like llvm [2] which has had this enabled by default for >6 years at the
time of this writing, and no real negative consequences that I can find
with my search-skills.

NOTE: That the compile_commands.json is currently produced only when
using the Ninja and Makefile generators.  See The CMake documentation[3]
for more info.

1: https://lore.kernel.org/git/CAOjrSZusMSvs7AS-ZDsV8aQUgsF2ZA754vSDjgFKMRgi_oZAWw@mail.gmail.com/
2: 2c5712051b
3: https://cmake.org/cmake/help/latest/variable/CMAKE_EXPORT_COMPILE_COMMANDS.html

Signed-off-by: Matthew Rogers <mattr94@gmail.com>
Acked-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-11 15:23:17 +09:00
cd0a852981 cmake: add knob to disable vcpkg
When building on windows users have the option to use vcpkg to provide
the dependencies needed to compile.  Previously, this was used only when
using the Visual Studio generator which was not ideal because:

  - Not all users who want to use vcpkg use the Visual Studio
    generators.

  - Some versions of Visual Studio 2019 moved away from using the
    VS 2019  generator by default, making it impossible for Visual
    Studio to configure the project in the likely event that it couldn't
    find the dependencies.

  - Inexperienced users of CMake are very likely to get tripped up by
    the errors caused by a lack of vcpkg, making the above bullet point
    both annoying and hard to debug.

As such, let's make using vcpkg the default on windows.  Users who want
to avoid using vcpkg can disable it by passing -DNO_VCPKG=TRUE.

Signed-off-by: Matthew Rogers <mattr94@gmail.com>
Acked-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-11 15:23:00 +09:00
f74d11471f multimail: stop shipping a copy
The multimail project is developed independently and has its own project
page. Traditionally, we shipped a copy in contrib/.

However, such a copy is prone to become stale, and users are much better
served to be directed to the actual project instead.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-11 13:35:19 +09:00
f96178c529 bulk-checkin: make buffer reuse more obvious and safer
ibuf can be reused for multiple iterations of the loop. Specifically:
deflate() overwrites s.avail_in to show how much of the input buffer
has not been processed yet - and sometimes leaves 'avail_in > 0', in
which case ibuf will be processed again during the loop's subsequent
iteration.

But if we declare ibuf within the loop, then (in theory) we get a new
(and uninitialised) buffer for every iteration. In practice, my compiler
seems to resue the same buffer - meaning that this code does work - but
it doesn't seem safe to rely on this behaviour. MSAN correctly catches
this issue - as soon as we hit the 's.avail_in > 0' condition, we end up
reading from what seems to be uninitialised memory.

Therefore, we move ibuf out of the loop, making this reuse safe.

See MSAN output from t1050-large below - the interesting part is the
ibuf creation at the end, although there's a lot of indirection before
we reach the read from unitialised memory:

==11294==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x7f75db58fb1c in crc32_little crc32.c:283:9
    #1 0x7f75db58d5b3 in crc32_z crc32.c:220:20
    #2 0x7f75db59668c in crc32 crc32.c:242:12
    #3 0x8c94f8 in hashwrite csum-file.c:101:15
    #4 0x825faf in stream_to_pack bulk-checkin.c:154:5
    #5 0x82467b in deflate_to_pack bulk-checkin.c:225:8
    #6 0x823ff1 in index_bulk_checkin bulk-checkin.c:264:15
    #7 0xa7cff2 in index_stream object-file.c:2234:9
    #8 0xa7bff7 in index_fd object-file.c:2256:9
    #9 0xa7d22d in index_path object-file.c:2274:7
    #10 0xb3c8c9 in add_to_index read-cache.c:802:7
    #11 0xb3e039 in add_file_to_index read-cache.c:835:9
    #12 0x4a99c3 in add_files add.c:458:7
    #13 0x4a7276 in cmd_add add.c:670:18
    #14 0x4a1e76 in run_builtin git.c:461:11
    #15 0x49e1e7 in handle_builtin git.c:714:3
    #16 0x4a0c08 in run_argv git.c:781:4
    #17 0x49d5a8 in cmd_main git.c:912:19
    #18 0x7974da in main common-main.c:52:11
    #19 0x7f75da66f349 in __libc_start_main (/lib64/libc.so.6+0x24349)
    #20 0x421bd9 in _start start.S:120

  Uninitialized value was stored to memory at
    #0 0x7f75db58fa6b in crc32_little crc32.c:283:9
    #1 0x7f75db58d5b3 in crc32_z crc32.c:220:20
    #2 0x7f75db59668c in crc32 crc32.c:242:12
    #3 0x8c94f8 in hashwrite csum-file.c:101:15
    #4 0x825faf in stream_to_pack bulk-checkin.c:154:5
    #5 0x82467b in deflate_to_pack bulk-checkin.c:225:8
    #6 0x823ff1 in index_bulk_checkin bulk-checkin.c:264:15
    #7 0xa7cff2 in index_stream object-file.c:2234:9
    #8 0xa7bff7 in index_fd object-file.c:2256:9
    #9 0xa7d22d in index_path object-file.c:2274:7
    #10 0xb3c8c9 in add_to_index read-cache.c:802:7
    #11 0xb3e039 in add_file_to_index read-cache.c:835:9
    #12 0x4a99c3 in add_files add.c:458:7
    #13 0x4a7276 in cmd_add add.c:670:18
    #14 0x4a1e76 in run_builtin git.c:461:11
    #15 0x49e1e7 in handle_builtin git.c:714:3
    #16 0x4a0c08 in run_argv git.c:781:4
    #17 0x49d5a8 in cmd_main git.c:912:19
    #18 0x7974da in main common-main.c:52:11
    #19 0x7f75da66f349 in __libc_start_main (/lib64/libc.so.6+0x24349)

  Uninitialized value was stored to memory at
    #0 0x447eb9 in __msan_memcpy msan_interceptors.cpp:1558:3
    #1 0x7f75db5c2011 in flush_pending deflate.c:746:5
    #2 0x7f75db5cafa0 in deflate_stored deflate.c:1815:9
    #3 0x7f75db5bb7d2 in deflate deflate.c:1005:34
    #4 0xd80b7f in git_deflate zlib.c:244:12
    #5 0x825dff in stream_to_pack bulk-checkin.c:140:12
    #6 0x82467b in deflate_to_pack bulk-checkin.c:225:8
    #7 0x823ff1 in index_bulk_checkin bulk-checkin.c:264:15
    #8 0xa7cff2 in index_stream object-file.c:2234:9
    #9 0xa7bff7 in index_fd object-file.c:2256:9
    #10 0xa7d22d in index_path object-file.c:2274:7
    #11 0xb3c8c9 in add_to_index read-cache.c:802:7
    #12 0xb3e039 in add_file_to_index read-cache.c:835:9
    #13 0x4a99c3 in add_files add.c:458:7
    #14 0x4a7276 in cmd_add add.c:670:18
    #15 0x4a1e76 in run_builtin git.c:461:11
    #16 0x49e1e7 in handle_builtin git.c:714:3
    #17 0x4a0c08 in run_argv git.c:781:4
    #18 0x49d5a8 in cmd_main git.c:912:19
    #19 0x7974da in main common-main.c:52:11

  Uninitialized value was stored to memory at
    #0 0x447eb9 in __msan_memcpy msan_interceptors.cpp:1558:3
    #1 0x7f75db644241 in _tr_stored_block trees.c:873:5
    #2 0x7f75db5cad7c in deflate_stored deflate.c:1813:9
    #3 0x7f75db5bb7d2 in deflate deflate.c:1005:34
    #4 0xd80b7f in git_deflate zlib.c:244:12
    #5 0x825dff in stream_to_pack bulk-checkin.c:140:12
    #6 0x82467b in deflate_to_pack bulk-checkin.c:225:8
    #7 0x823ff1 in index_bulk_checkin bulk-checkin.c:264:15
    #8 0xa7cff2 in index_stream object-file.c:2234:9
    #9 0xa7bff7 in index_fd object-file.c:2256:9
    #10 0xa7d22d in index_path object-file.c:2274:7
    #11 0xb3c8c9 in add_to_index read-cache.c:802:7
    #12 0xb3e039 in add_file_to_index read-cache.c:835:9
    #13 0x4a99c3 in add_files add.c:458:7
    #14 0x4a7276 in cmd_add add.c:670:18
    #15 0x4a1e76 in run_builtin git.c:461:11
    #16 0x49e1e7 in handle_builtin git.c:714:3
    #17 0x4a0c08 in run_argv git.c:781:4
    #18 0x49d5a8 in cmd_main git.c:912:19
    #19 0x7974da in main common-main.c:52:11

  Uninitialized value was stored to memory at
    #0 0x447eb9 in __msan_memcpy msan_interceptors.cpp:1558:3
    #1 0x7f75db5c8fcf in deflate_stored deflate.c:1783:9
    #2 0x7f75db5bb7d2 in deflate deflate.c:1005:34
    #3 0xd80b7f in git_deflate zlib.c:244:12
    #4 0x825dff in stream_to_pack bulk-checkin.c:140:12
    #5 0x82467b in deflate_to_pack bulk-checkin.c:225:8
    #6 0x823ff1 in index_bulk_checkin bulk-checkin.c:264:15
    #7 0xa7cff2 in index_stream object-file.c:2234:9
    #8 0xa7bff7 in index_fd object-file.c:2256:9
    #9 0xa7d22d in index_path object-file.c:2274:7
    #10 0xb3c8c9 in add_to_index read-cache.c:802:7
    #11 0xb3e039 in add_file_to_index read-cache.c:835:9
    #12 0x4a99c3 in add_files add.c:458:7
    #13 0x4a7276 in cmd_add add.c:670:18
    #14 0x4a1e76 in run_builtin git.c:461:11
    #15 0x49e1e7 in handle_builtin git.c:714:3
    #16 0x4a0c08 in run_argv git.c:781:4
    #17 0x49d5a8 in cmd_main git.c:912:19
    #18 0x7974da in main common-main.c:52:11
    #19 0x7f75da66f349 in __libc_start_main (/lib64/libc.so.6+0x24349)

  Uninitialized value was stored to memory at
    #0 0x447eb9 in __msan_memcpy msan_interceptors.cpp:1558:3
    #1 0x7f75db5ea545 in read_buf deflate.c:1181:5
    #2 0x7f75db5c97f7 in deflate_stored deflate.c:1791:9
    #3 0x7f75db5bb7d2 in deflate deflate.c:1005:34
    #4 0xd80b7f in git_deflate zlib.c:244:12
    #5 0x825dff in stream_to_pack bulk-checkin.c:140:12
    #6 0x82467b in deflate_to_pack bulk-checkin.c:225:8
    #7 0x823ff1 in index_bulk_checkin bulk-checkin.c:264:15
    #8 0xa7cff2 in index_stream object-file.c:2234:9
    #9 0xa7bff7 in index_fd object-file.c:2256:9
    #10 0xa7d22d in index_path object-file.c:2274:7
    #11 0xb3c8c9 in add_to_index read-cache.c:802:7
    #12 0xb3e039 in add_file_to_index read-cache.c:835:9
    #13 0x4a99c3 in add_files add.c:458:7
    #14 0x4a7276 in cmd_add add.c:670:18
    #15 0x4a1e76 in run_builtin git.c:461:11
    #16 0x49e1e7 in handle_builtin git.c:714:3
    #17 0x4a0c08 in run_argv git.c:781:4
    #18 0x49d5a8 in cmd_main git.c:912:19
    #19 0x7974da in main common-main.c:52:11

  Uninitialized value was created by an allocation of 'ibuf' in the stack frame of function 'stream_to_pack'
    #0 0x825710 in stream_to_pack bulk-checkin.c:101

SUMMARY: MemorySanitizer: use-of-uninitialized-value crc32.c:283:9 in crc32_little
Exiting

Signed-off-by: Andrzej Hunt <andrzej@ahunt.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-11 13:01:22 +09:00
1d72b604ef add_pending_object_with_path(): work around "gcc -O3" complaint
When compiling with -O3, some gcc versions (10.2.1 here) complain about
an out-of-bounds subscript:

  revision.c: In function ‘do_add_index_objects_to_pending’:
  revision.c:321:22: error: array subscript [1, 2147483647] is outside array bounds of ‘char[1]’ [-Werror=array-bounds]
    321 |   if (0 < len && name[len] && buf.len)
        |                  ~~~~^~~~~

The "len" parameter here comes from calling interpret_branch_name(),
which intends to return the number of characters of "name" it parsed.

But the compiler doesn't realize this. It knows the size of the empty
string "name" passed in from do_add_index_objects_to_pending(), but it
has no clue that the "len" we get back will be constrained to "0" in
that case.

And I don't think the warning is telling us about some subtle or clever
bug. The implementation of interpret_branch_name() is in another file
entirely, and the compiler can't see it (you can even verify there is no
clever LTO going on by replacing it with "return 0" and still getting
the warning).

We can work around this by replacing our "did we hit the trailing NUL"
subscript dereference with a length check. We do not even have to pay
the cost for an extra strlen(), as we can pass our new length into
interpret_branch_name(), which was converting our "0" into a call to
strlen() anyway.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-11 12:45:37 +09:00
382b601acd ll_union_merge(): rename path_unused parameter
The "path" parameter to ll_union_merge() is named "path_unused", since
we don't ourselves use it. But we do pass it to ll_xdl_merge(), which
may look at it (it gets passed to ll_binary_merge(), which may pass it
to warning()). Let's rename it to correct this inaccuracy (both of the
other functions correctly do not call this "unused").

Note that we also pass drv_unused, but it truly is unused by the rest of
the stack (it only exists at all to provide a generic interface that
matches what ll_ext_merge() needs).

Reported-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-11 12:37:33 +09:00
7f53f78b04 ll_union_merge(): pass name labels to ll_xdl_merge()
Since cd1d61c44f (make union merge an xdl merge favor, 2010-03-01), we
pass NULL to ll_xdl_merge() for the "name" labels of the ancestor, ours
and theirs buffers. We usually use these for annotating conflict markers
left in a file. For a union merge, these shouldn't matter; the point of
it is that we'd never leave conflict markers in the first place.

But there is one code path where we may dereference them: if the file
contents appear to be binary, ll_binary_merge() will give up and pass
them to warning() to generate a message for the user (that was true even
when cd1d61c44f was written, though the warning was in ll_xdl_merge()
back then).

That can result in a segfault, though on many systems (including glibc),
the printf routines will helpfully just say "(null)" instead. We can
extend our binary-union test in t6406 to check stderr, which catches the
problem on all systems.

This also fixes a warning from "gcc -O3". Unlike lower optimization
levels, it inlines enough to see that the NULL can make it to warning()
and complains:

  In function ‘ll_binary_merge’,
      inlined from ‘ll_xdl_merge’ at ll-merge.c:115:10,
      inlined from ‘ll_union_merge’ at ll-merge.c:151:9:
  ll-merge.c:74:4: warning: ‘%s’ directive argument is null [-Wformat-overflow=]
     74 |    warning("Cannot merge binary files: %s (%s vs. %s)",
        |    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     75 |     path, name1, name2);
        |     ~~~~~~~~~~~~~~~~~~~

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-11 12:37:07 +09:00
7d879ad7e0 ll_binary_merge(): handle XDL_MERGE_FAVOR_UNION
Prior to commit a944af1d86 (merge: teach -Xours/-Xtheirs to binary
ll-merge driver, 2012-09-08), we always reported a conflict from
ll_binary_merge() by returning "1" (in the xdl_merge and ll_merge code,
this value is the number of conflict hunks). After that commit, we
report zero conflicts if the "variant" flag is set, under the assumption
that it is one of XDL_MERGE_FAVOR_OURS or XDL_MERGE_FAVOR_THEIRS.

But this gets confused by XDL_MERGE_FAVOR_UNION. We do not know how to
do a binary union merge, but erroneously report no conflicts anyway (and
just blindly use the "ours" content as the result).

Let's tighten our check to just the cases that a944af1d86 meant to
cover. This fixes the union case (which existed already back when that
commit was made), as well as future-proofing us against any other
variants that get added later.

Note that you can't trigger this from "git merge-file --union", as that
bails on binary files before even calling into the ll-merge machinery.
The test here uses the "union" merge attribute, which does erroneously
report a successful merge.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-11 12:37:04 +09:00
211eca0895 The first batch post Git 2.32
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-10 12:04:27 +09:00
ccf0378905 Merge branch 'ah/setup-extensions-message-i18n-fix'
Message update.

* ah/setup-extensions-message-i18n-fix:
  setup: split "extensions found" messages into singular and plural
2021-06-10 12:04:27 +09:00
3153c83c77 Merge branch 'ah/fetch-reject-warning-grammofix'
Message update.

* ah/fetch-reject-warning-grammofix:
  fetch: improve grammar of "shallow roots" message
2021-06-10 12:04:27 +09:00
7ce7a617b9 Merge branch 'jk/doc-color-pager'
The documentation for "color.pager" configuration variable has been
updated.

* jk/doc-color-pager:
  doc: explain the use of color.pager
2021-06-10 12:04:26 +09:00
8e1d2fc0cc Merge branch 'tl/fix-packfile-uri-doc'
Doc fix.

* tl/fix-packfile-uri-doc:
  packfile-uri.txt: fix blobPackfileUri description
2021-06-10 12:04:26 +09:00
7f06d94e72 Merge branch 'ry/clarify-fast-forward-in-glossary'
The description of "fast-forward" in the glossary has been updated.

* ry/clarify-fast-forward-in-glossary:
  docs: improve fast-forward in glossary content
2021-06-10 12:04:26 +09:00
e4b5d2a83f Merge branch 'wm/rev-parse-die-i18n'
Quite a many die() messages in rev-parse haven't been marked for
translation.

* wm/rev-parse-die-i18n:
  rev-parse: mark die() messages for translation
2021-06-10 12:04:25 +09:00
b009fd41e8 Merge branch 'jc/clarify-revision-range'
Doc update.

* jc/clarify-revision-range:
  revisions(7): clarify that most commands take a single revision range
2021-06-10 12:04:25 +09:00
d8c6dc2a5a Merge branch 'ah/doc-describe'
Doc update.

* ah/doc-describe:
  describe-doc: clarify default length of abbreviation
2021-06-10 12:04:24 +09:00
f44416c823 Merge branch 'ah/submodule-helper-module-summary-parseopt'
Message update.

* ah/submodule-helper-module-summary-parseopt:
  submodule: use the imperative mood to describe the --files option
2021-06-10 12:04:24 +09:00
ce885c5342 Merge branch 'ah/stash-usage-i18n-fix'
i18n update.

* ah/stash-usage-i18n-fix:
  stash: don't translate literal commands
2021-06-10 12:04:24 +09:00
b03709eae7 Merge branch 'ah/merge-usage-i18n-fix'
i18n update.

* ah/merge-usage-i18n-fix:
  merge: don't translate literal commands
2021-06-10 12:04:23 +09:00
d6e35a2644 Merge branch 'jn/size-t-casted-to-off-t-fix'
Rewrite code that triggers undefined behaiour warning.

* jn/size-t-casted-to-off-t-fix:
  xsize_t: avoid implementation defined behavior when len < 0
2021-06-10 12:04:23 +09:00
bb6a63a4e5 Merge branch 'mt/parallel-checkout-with-padded-oidcpy'
The parallel checkout codepath did not initialize object ID field
used to talk to the worker processes in a futureproof way.

* mt/parallel-checkout-with-padded-oidcpy:
  parallel-checkout: send the new object_id algo field to the workers
2021-06-10 12:04:22 +09:00
26b25e03b2 Merge branch 'ef/mailinfo-short-name'
We historically rejected a very short string as an author name
while accepting a patch e-mail, which has been loosened.

* ef/mailinfo-short-name:
  mailinfo: don't discard names under 3 characters
2021-06-10 12:04:22 +09:00
a45e390ad6 gitweb: use HEAD as secondary sort key in git_get_heads_list()
The "heads" section on the gitweb summary page shows heads in
`-committerdate` order (ie. the most recently-modified ones at the
top), tie-breaking equal-dated refs using the implicit `refname` sort
fallback. This recency-based ordering appears in multiple places in the
UI, such as the project listing, the tags list, and even the
shortlog and log views.

Given two equal-dated refs, however, sorting the `HEAD` ref before
the non-`HEAD` ref provides more useful signal than merely sorting by
refname. For example, say we had "master" and "trunk" both pointing at
the same commit but "trunk" was `HEAD`, sorting "trunk" first helps
communicate its special status as the default branch that you'll check
out if you clone the repo.

Add `-HEAD` as a secondary sort key to the `git for-each-ref` call
in `git_get_heads_list()` to provide the desired behavior. The most
recently committed refs will appear first, but `HEAD`-ness will be used
as a tie-breaker. Note that `refname` is the implicit fallback sort key,
which means that two same-dated non-`HEAD` refs will continue to be
sorted in lexicographical order, as they are today.

Signed-off-by: Greg Hurrell <greg@hurrell.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-10 09:49:20 +09:00
ef68c3d800 merge-ort: miscellaneous touch-ups
Add some notes in the code about invariants with match_mask when adding
pairs.  Also add a comment that seems to have been left out in my work
of pushing these changes upstream.

Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-09 11:40:04 +09:00
356da0f98b Fix various issues found in comments
A random hodge-podge of incorrect or out-of-date comments that I found:

  * t6423 had a comment that has referred to the wrong test for years;
    fix it to refer to the right one.
  * diffcore-rename had a FIXME comment meant to remind myself to
    investigate if I could make another code change.  I later
    investigated and removed the FIXME, but while cherry-picking the
    patch to submit upstream I missed the later update.  Remove the
    comment now.
  * merge-ort had the early part of a comment for a function; I had
    meant to include the more involved description when I updated the
    function.  Update the comment now.

Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-09 11:40:04 +09:00
61bf4490af diffcore-rename: avoid unnecessary strdup'ing in break_idx
The keys of break_idx are strings from the diff_filepairs of
diff_queued_diff.  break_idx is only used in location_rename_dst(), and
that usage is always before any free'ing of the pairs (and thus the
strings in the pairs).  As such, there is no need to strdup these keys;
we can just reuse the existing strings as-is.

The merge logic doesn't make use of break detection, so this does not
affect the performance of any of my testcases.  It was just a minor
unrelated optimization noted in passing while looking at the code.

Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-09 11:40:04 +09:00
5a3743da32 merge-ort: replace string_list_df_name_compare with faster alternative
Gathering accumulated times from trace2 output on the mega-renames
testcase, I saw the following timings (where I'm only showing a few
lines to highlight the portions of interest):

    10.120 : label:incore_nonrecursive
        4.462 : ..label:process_entries
           3.143 : ....label:process_entries setup
              2.988 : ......label:plist special sort
           1.305 : ....label:processing
        2.604 : ..label:collect_merge_info
        2.018 : ..label:merge_start
        1.018 : ..label:renames

In the above output, note that the 4.462 seconds for process_entries was
split as 3.143 seconds for "process_entries setup" and 1.305 seconds for
"processing" (and a little time for other stuff removed from the
highlight).  Most of the "process_entries setup" time was spent on
"plist special sort" which corresponds to the following code:

    trace2_region_enter("merge", "plist special sort", opt->repo);
    plist.cmp = string_list_df_name_compare;
    string_list_sort(&plist);
    trace2_region_leave("merge", "plist special sort", opt->repo);

In other words, in a merge strategy that would be invoked by passing
"-sort" to either rebase or merge, sorting an array takes more time than
anything else.  Serves me right for naming my merge strategy this way.

Rewrite the comparison function in a way that does not require finding
out the lengths of the strings when comparing them.  While at it, tweak
the code for our specific case -- no need to handle a variety of modes,
for example.  The combination of these changes reduced the time spent in
"plist special sort" by ~25% in the mega-renames case.

For the testcases mentioned in commit 557ac0350d ("merge-ort: begin
performance work; instrument with trace2_region_* calls", 2020-10-28),
this change improves the performance as follows:

                            Before                  After
    no-renames:        5.622 s ±  0.059 s     5.235 s ±  0.042 s
    mega-renames:     10.127 s ±  0.073 s     9.419 s ±  0.107 s
    just-one-mega:   500.3  ms ±  3.8  ms   480.1  ms ±  3.9  ms

Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-09 11:40:03 +09:00
4184cbd635 mailinfo: use starts_with() when checking scissors
Existing checks for scissors characters using memcmp(3) never read past
the end of the line, because all substrings we are interested in are two
characters long, and the outer loop guarantees we have at least one
character.  So at most we will look at the NUL.

However, this is too subtle and may lead to bugs in code which copies
this behavior without realizing substring length requirement.  So use
starts_with() instead, which will stop at NUL regardless of the length
of the prefix.  Remove extra pair of parentheses while we are here.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-09 11:13:07 +09:00
91d2347033 MyFirstContribution: link #git-devel to Libera Chat
Many of the regulars on #git-devel are now on Libera Chat, to the extent
that the community page now lists it as the IRC Channel[1]. This will
help new contributors find the right place, if they choose to ask
questions on `#git-devel`.

Relevant discussion on the IRC transition:
https://lore.kernel.org/git/CAJoAoZ=e62sceNpcR5L5zjsj177uczTnXjcAg+BbOoOkeH8vXQ@mail.gmail.com/

[1] https://git-scm.com/community

Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Reviewed-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-09 09:22:54 +09:00
338abb0f04 builtins + test helpers: use return instead of exit() in cmd_*
Change various cmd_* functions that claim to return an "int" to use
"return" instead of exit() to indicate an exit code. These were not
marked with NORETURN, and by directly exit()-ing we'll skip the
cleanup git.c would otherwise do (e.g. closing fd's, erroring if we
can't). See run_builtin() in git.c.

In the case of shell.c and sh-i18n--envsubst.c this was the result of
an incomplete migration to using a cmd_main() in 3f2e2297b9 (add an
extra level of indirection to main(), 2016-07-01).

This was spotted by SunCC 12.5 on Solaris 10 (gcc210 on the gccfarm).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-09 09:15:58 +09:00
6fb9195f6c doc: warn people against --max-pack-size
This option is almost never a good idea, as the resulting repository is
larger and slower (see the new explanations in the docs).

I outlined the potential problems. We could go further and make the
option harder to find (or at least, make the command-line option
descriptions a much more terse "you probably don't want this; see
pack.packsizeLimit for details"). But this seems like a minimal change
that may prevent people from thinking it's more useful than it is.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-09 08:56:09 +09:00
482c962de4 t: use user-specified utf-8 locale for testing svn
In some test-cases, UTF-8 locale is required. To find such locale,
we're using the first available UTF-8 locale that returned by
"locale -a".

However, the locale(1) utility is unavailable on some systems,
e.g. Linux with musl libc.

However, without "locale -a", we can't guess provided UTF-8 locale.

Add a Makefile knob GIT_TEST_UTF8_LOCALE and activate it for
linux-musl in our CI system.

Rename t/lib-git-svn.sh:prepare_a_utf8_locale to prepare_utf8_locale,
since we no longer prepare the variable named "a_utf8_locale",
but set up a fallback value for GIT_TEST_UTF8_LOCALE instead.
The fallback will be LC_ALL, LANG environment variable,
or the first UTF-8 locale from output of "locale -a", in that order.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-08 16:07:37 +09:00
8603c419d3 doc: merge: mention default of defaulttoupstream
Commit a01f7f2ba0 (merge: enable defaulttoupstream by default,
2014-04-20) forgot to mention the new default in the configuration
documentation.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-08 14:24:51 +09:00
a0538e5c8b doc/log: correct default for --decorate
There're two different default options for log --decorate:
* Should `--decorate` be given without any arguments, it's default to
  `short`
* Should neither `--decorate` nor `--no-decorate` be given, it's default
  to the `log.decorate` or `auto`.

We documented the former, but not the latter.

Let's document them, too.

Reported-by: Andy AO <zen96285@gmail.com>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-08 13:34:48 +09:00
47eb4c6890 mergetools/kdiff3: make kdiff3 work on Windows too
The native kdiff3 mergetool is not found by git mergetool on
Windows.  The message "The merge tool kdiff3 is not available as
'kdiff3'" is displayed.

Just like we translate the name of the binary and look for it on the
search path for WinMerge, do the same for kdiff3 to find it.

Signed-off-by: Michael Schindler michael@compressconsult.com
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-08 10:26:16 +09:00
546096a5cb xdiff: use BUG(...), not xdl_bug(...)
The xdl_bug() function was introduced in
e8adf23d1e (xdl_change_compact(): introduce the concept of a change
group, 2016-08-22), let's use our usual BUG() function instead.

We'll now have meaningful line numbers if we encounter bugs in xdiff,
and less code duplication.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-08 10:09:28 +09:00
b7b793d1e7 read-cache.c: don't guard calls to progress.c API
Don't guard the calls to the progress.c API with "if (progress)". The
API itself will check this. This doesn't change any behavior, but
makes this code consistent with the rest of the codebase.

See ae9af12287 (status: show progress bar if refreshing the index
takes too long, 2018-09-15) for the commit that added the pattern
we're changing here.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-08 10:09:13 +09:00
d94f9b8e90 protocol-caps.h: add newline at end of file
Add a trailing newline to the protocol-caps.h file added in the recent
a2ba162cda (object-info: support for retrieving object info,
2021-04-20). Various editors add this implicitly, and some compilers
warn about the lack of a \n here.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-08 10:08:41 +09:00
52ff891c03 t: fix whitespace around &&
Add missing spaces before '&&' and switch tabs around '&&' to spaces.

These issues were found using `git grep '[^ ]&&$'` and
`git grep -P '&&\t'`.

Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-08 10:08:01 +09:00
7b4d495439 l10n: zh_CN: review for git v2.32.0 l10n round 1
Reviewed-by: 依云 <lilydjwg@gmail.com>
Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
2021-06-07 19:32:32 +08:00
ebf3c04b26 Git 2.32
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-06 15:40:01 +09:00
15664a5f35 Merge tag 'l10n-2.32.0-rnd1.1' of git://github.com/git-l10n/git-po
l10n-2.32.0-rnd1.1

* tag 'l10n-2.32.0-rnd1.1' of git://github.com/git-l10n/git-po: (25 commits)
  l10n: es: 2.32.0 round 1
  l10n: zh_CN: for git v2.32.0 l10n round 1
  l10n: Update Catalan translation
  l10n: de.po: Update German translation for Git v2.32.0
  l10n: README: note on fuzzy translations
  l10n: README: document l10n conventions
  l10n: README: document "core translation"
  l10n: README: document git-po-helper
  l10n: README: add file extention ".md"
  l10n: pt_PT: add Portuguese translations part 3
  l10n: bg.po: Updated Bulgarian translation (5204t)
  l10n: id: po-id for 2.32.0 (round 1)
  l10n: vi.po(5204t): Updated Vietnamese translation for v2.32.0
  l10n: zh_TW.po: localized
  l10n: zh_TW.po: v2.32.0 round 1 (11 untranslated)
  l10n: sv.po: Update Swedish translation (5204t0f0u)
  l10n: fix typos in po/TEAMS
  l10n: fr: v2.32.0 round 1
  l10n: tr: v2.32.0-r1
  l10n: fr: fixed inconsistencies
  ...
2021-06-06 15:39:21 +09:00
0d3505e286 Merge branch 'rs/parallel-checkout-test-fix'
Test fix.

* rs/parallel-checkout-test-fix:
  parallel-checkout: avoid dash local bug in tests
2021-06-06 15:39:10 +09:00
0481af98ba Merge branch 'jc/fsync-can-fail-with-eintr'
Last minute portability fix.

* jc/fsync-can-fail-with-eintr:
  fsync(): be prepared to see EINTR
2021-06-06 15:39:10 +09:00
ebee5580ca parallel-checkout: avoid dash local bug in tests
Dash bug https://bugs.launchpad.net/ubuntu/+source/dash/+bug/139097
lets the shell erroneously perform field splitting on the expansion of a
command substitution during declaration of a local variable.  It causes
the parallel-checkout tests to fail e.g. when running them with
/bin/dash on MacOS 11.4, where they error out like this:

   ./t2080-parallel-checkout-basics.sh: 33: local: 0: bad variable name

That's because the output of wc -l contains leading spaces and the
returned number of lines is treated as another variable to declare, i.e.
as in "local workers= 0".

Work around it by enclosing the command substitution in quotes.

Helped-by: Matheus Tavares Bernardino <matheus.bernardino@usp.br>
Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-06 10:40:26 +09:00
8e02217e10 l10n: es: 2.32.0 round 1
Signed-off-by: Christopher Diaz Riveros <christopher.diaz.riv@gmail.com>
2021-06-05 20:06:23 -05:00
33b62fba4d l10n: zh_CN: for git v2.32.0 l10n round 1
Translate 126 new messages (5204t0f0u) for git 2.32.0.

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
2021-06-05 22:45:18 +08:00
de65c76e55 Merge branch 'fix_typo' of github.com:e-yes/git
* 'fix_typo' of github.com:e-yes/git:
  l10n: ru.po: fix typo in Russian translation
2021-06-05 21:30:30 +08:00
cccdfd2243 fsync(): be prepared to see EINTR
Some platforms, like NonStop do not automatically restart fsync()
when interrupted by a signal, even when that signal is setup with
SA_RESTART.

This can lead to test breakage, e.g., where "--progress" is used,
thus SIGALRM is sent often, and can interrupt an fsync() syscall.

Make sure we deal with such a case by retrying the syscall
ourselves.  Luckily, we call fsync() fron a single wrapper,
fsync_or_die(), so the fix is fairly isolated.

Reported-by: Randall S. Becker <randall.becker@nexbridge.ca>
Helped-by: Jeff King <peff@peff.net>
Helped-by: Taylor Blau <me@ttaylorr.com>
[jc: the above two did most of the work---I just tied the loose end]
Helped-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-05 22:13:40 +09:00
d5e7f9632f Merge branch 'pt-PT' of github.com:git-l10n-pt-PT/git-po
* 'pt-PT' of github.com:git-l10n-pt-PT/git-po:
  l10n: pt_PT: add Portuguese translations part 3
  l10n: pt_PT: add Portuguese translations part 2
2021-06-04 18:59:17 +08:00
a2bb98ba76 l10n: Update Catalan translation
Signed-off-by: Jordi Mas <jmas@softcatala.org>
2021-06-04 06:58:05 +02:00
7ba68e0cf1 docs: fix api-trace2 doc for "too_many_files" event
In 87db61a (trace2: write discard message to sentinel files,
2019-10-04), we added a new "too_many_files" event for when trace2
logging would create too many files in an output directory.
Unfortunately, the api-trace2 doc described a "discard" event instead.
Fix the doc to use the correct event name.

Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-04 12:11:16 +09:00
ace6d8e3d6 Remove warning that repack only works on non-promisor packfiles
The git-repack doc clearly states that it *does* operate on promisor
packfiles (in a separate partition), with "-a" specified. Presumably
the statements here are outdated, as they feature from the first doc
in 2017 (and the repack support was added in 2018)

Signed-off-by: Tao Klerks <tao@klerks.biz>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-04 09:45:47 +09:00
ee02ac6164 cat-file: merge two block into one
There are two "if (opt->all_objects)" blocks next
to each other, merge them into one to provide better
readability.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-04 07:50:26 +09:00
e16acc80a7 cat-file: handle trivial --batch format with --batch-all-objects
The --batch code to print an object assumes we found out the type of
the object from calling oid_object_info_extended(). This is true for
the default format, but even in a custom format, we manually modify
the object_info struct to ask for the type.

This assumption was broken by 845de33a5b (cat-file: avoid noop calls
to sha1_object_info_extended, 2016-05-18). That commit skips the call
to oid_object_info_extended() entirely when --batch-all-objects is in
use, and the custom format does not include any placeholders that
require calling it.

Or when the custom format only include placeholders like %(objectname) or
%(rest), oid_object_info_extended() will not get the type of the object.

This results in an error when we try to confirm that the type didn't
change:

$ git cat-file --batch=batman --batch-all-objects
batman
fatal: object 000023961a changed type!?

and also has other subtle effects (e.g., we'd fail to stream a blob,
since we don't realize it's a blob in the first place).

We can fix this by flipping the order of the setup. The check for "do
we need to get the object info" must come _after_ we've decided
whether we need to look up the type.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-04 07:50:26 +09:00
94d17948af l10n: de.po: Update German translation for Git v2.32.0
Reviewed-by: Ralf Thielow <ralf.thielow@gmail.com>
Signed-off-by: Matthias Rüster <matthias.ruester@gmail.com>
2021-06-02 19:24:10 +02:00
c09b6306c6 Git 2.32-rc3
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 12:51:09 +09:00
0b18023d00 contrib/completion: fix zsh completion regression from 59d85a2a05
A recent change to make git-completion.bash use $__git_cmd_idx
in more places broke a number of completions on zsh because it
modified __git_main but did not update __git_zsh_main.

Notably, completions for "add", "branch", "mv" and "push" were
broken as a result of this change.

In addition to the undefined variable usage, "git mv <tab>" also
prints the following error:

	__git_count_arguments:7: bad math expression:
	operand expected at `"1"'

	_git_mv:[:7: unknown condition: -gt

Remove the quotes around $__git_cmd_idx in __git_count_arguments
and set __git_cmd_idx=1 early in __git_zsh_main to fix the
regressions from 59d85a2a05.

This was tested on zsh 5.7.1 (x86_64-apple-darwin19.0).

Suggested-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: David Aguilar <davvid@gmail.com>
Acked-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 12:49:40 +09:00
3714fbcb45 l10n: README: note on fuzzy translations
Fuzzy translation problem can occur when updating translations.

Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
2021-06-02 11:21:26 +08:00
69c13a7880 l10n: README: document l10n conventions
Document the conventions that l10n contributors must follow.

Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
2021-06-02 11:21:26 +08:00
2fb9d2596f l10n: README: document "core translation"
Contributor for a new language must complete translations of a small set
of l10n messages.

Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
2021-06-02 11:21:26 +08:00
6d09c53001 l10n: README: document git-po-helper
Document the PO helper program (git-po-helper) with installation and
basic usage.

Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
2021-06-02 11:21:26 +08:00
e54b271529 l10n: README: add file extention ".md"
Add file extension ".md" to "po/README" to help to display this markdown
file properly.

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
2021-06-02 11:21:26 +08:00
7088ce7191 push: don't get a full remote object
All we need to know is that their names are the same.

Additionally this might be easier to parse for some since
remote_for_branch is more descriptive than remote_get(NULL).

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:12:03 +09:00
e0c91cffde push: only check same_remote when needed
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:12:02 +09:00
c5b09cf771 push: remove trivial function
It's a single line that is used in a single place, and the variable has
the same name as the function.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:12:02 +09:00
1afd78fb5c push: remove redundant check
If fetch_remote is NULL (i.e. the branch remote is invalid), then it
can't possibly be same as remote, which can't be NULL.

The check is redundant, and so is the extra variable.

Also, fix the Yoda condition: we want to check if remote is the same as
the branch remote, not the other way around.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:12:02 +09:00
1f934725f7 push: factor out the typical case
Only override dst on the odd case.

This allows a preemptive break on the `simple` case.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:12:02 +09:00
0add899baf push: get rid of all the setup_push_* functions
Their code is much simpler now and can move into the parent function.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:12:02 +09:00
d371a9ef4c push: trivial simplifications
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:12:01 +09:00
00458dc5f1 push: make setup_push_* return the dst
All of the setup_push_* functions are appending a refspec. Do this only
once on the parent function.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:12:01 +09:00
65c63a0054 push: only get the branch when needed
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:12:01 +09:00
cc16f95d21 push: factor out null branch check
No need to do it in every single function.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:12:01 +09:00
04159fba42 push: split switch cases
We want all the cases that don't do anything with a branch first, and
then the rest. That way we will be able to get the branch and die if
there's a problem in the parent function, instead of inside the function
of each mode.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:12:01 +09:00
72739680fc push: return immediately in trivial switch case
There's no need to break when nothing else will be executed.

Will help next patches.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:12:01 +09:00
533e0325ab push: create new get_upstream_ref() helper
This code is duplicated among multiple functions.

No functional changes.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:12:00 +09:00
90cfb2666b doc: push: explain default=simple correctly
Now that the code has been simplified and it's clear what it's
actually doing, update the documentation to reflect that.

Namely; the simple mode only barfs when working on a centralized
workflow, and there's no configured upstream branch with the same name.

Cc: Elijah Newren <newren@gmail.com>
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:09:52 +09:00
7e6d72bb11 push: remove unused code in setup_push_upstream()
Now it's not used for the simple mode.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:09:52 +09:00
b8e8b98647 push: simplify setup_push_simple()
There's a safety check to make sure branch->refname isn't different
from branch->merge[0]->src, otherwise we die().

Therefore we always push to branch->refname.

Suggestions-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:09:52 +09:00
6b010c80a2 push: reorganize setup_push_simple()
Simply move the code around and remove dead code. In particular the
'!same_remote' conditional is a no-op since that part of the code is the
same_remote leg of the conditional beforehand.

No functional changes.

Suggestions-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:09:51 +09:00
d099b9c9c7 push: copy code to setup_push_simple()
In order to avoid doing unnecessary things and simplify it in further
patches. In particular moving the additional name safety out of
setup_push_upstream() and into setup_push_simple() and thus making both
more straightforward.

The code is copied exactly as-is; no functional changes.

Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:09:51 +09:00
3b9fd8361f push: hedge code of default=simple
`simple` is the most important mode so move the relevant code to its own
function to make it easier to see what it's doing.

Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:09:51 +09:00
050f76b9af push: rename !triangular to same_remote
The typical case is what git was designed for: distributed remotes.

It's only the atypical case--fetching and pushing to the same
remote--that we need to keep an eye on.

No functional changes.

Liked-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:09:51 +09:00
1231cab341 t1415: set REFFILES for test specific to storage format
Packing refs (and therefore checking that certain refs are not packed)
is a property of the packed/loose ref storage. Add a comment to explain
what the test checks.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:01:55 +09:00
dc474899e7 t4202: mark bogus head hash test with REFFILES
In reftable, hashes are correctly formed by design.

Split off test for git-log in empty repo.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:01:55 +09:00
c139e58237 t7003: check reflog existence only for REFFILES
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:01:55 +09:00
e740873c47 t7900: stop checking for loose refs
Given that git-maintenance simply calls out git-pack-refs, it seems superfluous
to test the functionality of pack-refs itself, as that is covered by
t3210-pack-refs.sh.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:01:55 +09:00
fe8fc09f34 t1404: mark tests that muck with .git directly as REFFILES.
The packed/loose ref storage is an overlay combination of packed-refs (refs and
tags in a single file) and one-file-per-ref. This creates all kinds of edge
cases related to directory/file conflicts, (non-)empty directories, and the
locking scheme, none of which applies to reftable.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:01:55 +09:00
a5709636d9 t2017: mark --orphan/logAllRefUpdates=false test as REFFILES
In reftable, there is no notion of a per-ref 'existence' of a reflog. Each
reflog entry has its own key, so it is not possible to distinguish between
{reflog doesn't exist,reflog exists but is empty}. This makes the logic
in log_ref_setup() (file refs/files-backend.c), which depends on the existence
of the reflog file infeasible.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:01:55 +09:00
41e2e177c7 t1414: mark corruption test with REFFILES
The test checks what happens if reflog and ref database disagree on the state of
the latest commit. This seems to require accessing reflog storage directly.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:01:55 +09:00
759d02d1ae t1407: require REFFILES for for_each_reflog test
Add extensive comment why this test needs a REFFILES annotation.

I tried forcing universal reflog creation with core.logAllRefUpdates=true, but
that apparently also doesn't cause reflogs to be created for pseudorefs

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:01:55 +09:00
c305e667e0 test-lib: provide test prereq REFFILES
REFFILES can be used to mark tests that are specific to the packed/loose ref
storage format and its limitations. Marking such tests is a preparation for
introducing the reftable storage backend.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:01:54 +09:00
d491f5ea07 t5304: use "reflog expire --all" to clear the reflog
This test checks that unreachable objects are really removed. For the test to
work, it has to ensure that no reflog retain any reachable objects.

Previously, it did this by manipulating the file system to remove reflog in the
first test, and relying on git not updating the reflog if the relevant logfile
doesn't exist in follow-up tests.

Now, explicitly clear the reflog using 'reflog expire'. This reduces the
dependency between test functions. It also is more amenable to use with
reftable, which has no concept of (non)-existence of a reflog

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:01:54 +09:00
1fa9cf6ea1 t5304: restyle: trim empty lines, drop ':' before >
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:01:54 +09:00
fdc8acc706 t7003: use rev-parse rather than FS inspection
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:01:54 +09:00
f1ed224753 t5000: inspect HEAD using git-rev-parse
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:01:54 +09:00
9c8e7e968c t5000: reformat indentation to the latest fashion
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:01:54 +09:00
0218ad5d5c t1301: fix typo in error message
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:01:54 +09:00
1142746cbb t1413: use tar to save and restore entire .git directory
This makes the test independent of the particulars of the storage formats.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:01:54 +09:00
b1259ecff9 t1401-symbolic-ref: avoid direct filesystem access
Use symbolic-ref and rev-parse to inspect refs.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:01:54 +09:00
9910cbb6f9 t1401: use tar to snapshot and restore repo state
This is agnostic to the ref storage format

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:01:54 +09:00
dd8468ef00 t5601: read HEAD using rev-parse
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:01:54 +09:00
62038c81f3 t9300: check ref existence using test-helper rather than a file system check
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:01:54 +09:00
0221eb8678 t/helper/ref-store: initialize oid in resolve-ref
This will print $ZERO_OID when asking for a non-existent ref from the
test-helper.

Since resolve-ref provides direct access to refs_resolve_ref_unsafe(), it
provides a reliable mechanism for accessing REFNAME, while avoiding the implicit
resolution to refs/heads/REFNAME.

Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:01:54 +09:00
230356ba70 t4202: split testcase for invalid HEAD symref and HEAD hash
Reftable will prohibit invalid hashes at the storage level, but
git-symbolic-ref can still create branches ending in ".lock".

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:01:54 +09:00
ed125c4f07 Merge branch 'ab/fsck-api-cleanup'
Last minute compilation fix.

* ab/fsck-api-cleanup:
  builtin/fsck.c: don't conflate "int" and "enum" in callback
2021-06-02 07:34:27 +09:00
28abf260a5 builtin/fsck.c: don't conflate "int" and "enum" in callback
Fix a warning on AIX's xlc compiler that's been emitted since my
a1aad71601 (fsck.h: use "enum object_type" instead of "int",
2021-03-28):

    "builtin/fsck.c", line 805.32: 1506-068 (W) Operation between
    types "int(*)(struct object*,enum object_type,void*,struct
    fsck_options*)" and "int(*)(struct object*,int,void*,struct
    fsck_options*)" is not allowed.

I.e. it complains about us assigning a function with a prototype "int"
where we're expecting "enum object_type".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 05:59:15 +09:00
ed1e674f4d l10n: pt_PT: add Portuguese translations part 3
* Correct malformed strings
* Transforming 'não' (no) into affirmative

Signed-off-by: Daniel Santos <hello@brighterdan.com>
2021-06-01 11:45:52 +01:00
d1e7c2cac9 completion: add --anchored to diff's options
This flag was introduced in 2477ab2e (diff: support anchoring line(s),
2017-11-27) but back then, the bash completion script did not learn
about the new flag. Add it.

Signed-off-by: Thomas Braun <thomas.braun@virtuell-zuhause.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-31 06:21:14 +09:00
3d33e36c47 Merge branch 'l10n/zh_TW/21-05-20' of github.com:l10n-tw/git-po
* 'l10n/zh_TW/21-05-20' of github.com:l10n-tw/git-po:
  l10n: zh_TW.po: localized
  l10n: zh_TW.po: v2.32.0 round 1 (11 untranslated)
2021-05-30 21:40:59 +08:00
e94005634c Merge branch 'master' of github.com:Softcatala/git-po
* 'master' of github.com:Softcatala/git-po:
  l10n: Update Catalan translation
2021-05-30 20:45:10 +08:00
fe1c18ba4d l10n: bg.po: Updated Bulgarian translation (5204t)
Signed-off-by: Alexander Shopov <ash@kambanaria.org>
2021-05-28 17:45:58 +02:00
c95e3a3f0b send-email: move trivial config handling to Perl
Optimize the startup time of git-send-email by using an amended
config_regexp() function to retrieve the list of config keys and
values we're interested in.

For boolean keys we can handle the [true|false] case ourselves, and
the "--get" case didn't need any parsing. Let's leave "--path" and
other "--bool" cases to "git config". I'm not bothering with the
"undef" or "" case (true and false, respectively), let's just punt on
those and others and have "git config --type=bool" handle it.

The "grep { defined } @values" here covers a rather subtle case. For
list values such as sendemail.to it is possible as with any other
config key to provide a plain "-c sendemail.to", i.e. to set the key
as a boolean true. In that case the Git::config() API will return an
empty string, but this new parser will correctly return "undef".

However, that means we can end up with "undef" in the middle of a
list. E.g. for sendemail.smtpserveroption in conjuction with
sendemail.smtpserver as a path this would have produce a warning. For
most of the other keys we'd behave the same despite the subtle change
in the value, e.g. sendemail.to would behave the same because
Mail::Address->parse() happens to return an empty list if fed
"undef". For the boolean values we were already prepared to handle
these variables being initialized as undef anyway.

This brings the runtime of "git send-email" from ~60-~70ms to a very
steady ~40ms on my test box. We now run just one "git config"
invocation on startup instead of 8, the exact number will differ based
on the local sendemail.* config. I happen to have 8 of those set.

This brings the runtime of t9001-send-email.sh from ~13s down to ~12s
for me. The change there is less impressive as many of those tests set
various config values, and we're also getting to the point of
diminishing returns for optimizing "git send-email" itself.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-28 18:38:07 +09:00
17530b2ed2 perl: nano-optimize by replacing Cwd::cwd() with Cwd::getcwd()
It has been pointed out[1] that cwd() invokes "pwd(1)" while getcwd()
is a Perl-native XS function. For what we're using these for we can
use getcwd().

The performance difference is miniscule, we're saving on the order of
a millisecond or so, see [2] below for the benchmark. I don't think
this matters in practice for optimizing git-send-email or perl
execution (unlike the patches leading up to this one).

But let's do it regardless of that, if only so we don't have to think
about this as a low-hanging fruit anymore.

1. https://lore.kernel.org/git/20210512180517.GA11354@dcvr/
2.
    $ perl -MBenchmark=:all -MCwd -wE 'cmpthese(10000, { getcwd => sub { getcwd }, cwd => sub { cwd }, pwd => sub { system "pwd >/dev/null" }})'
                (warning: too few iterations for a reliable count)
                             Rate                  pwd                 cwd    getcwd
    pwd                     982/s                   --                -48%     -100%
    cwd                    1890/s                  92%                  --     -100%
    getcwd 10000000000000000000/s 1018000000000000000% 529000000000000064%        -

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-28 18:38:07 +09:00
5a544a4e11 perl: lazily load some common Git.pm setup code
Instead of unconditionally requiring modules such as File::Spec, let's
only load them when needed. This speeds up code that only needs a
subset of the features Git.pm provides.

This brings a plain invocation of "git send-email" down from 52/37
loaded modules under NO_GETTEXT=[|Y] to 39/18, and it now takes
~60-~70ms instead of ~80-~90ms. The runtime of t9001-send-email.sh
test is down to ~13s from ~15s.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-28 18:38:07 +09:00
f4dc9432fd send-email: lazily load modules for a big speedup
Optimize the time git-send-email takes to do even the simplest of
things (such as serving up "-h") from around ~150ms to ~80ms-~90ms by
lazily loading the modules it requires.

Before this change Devel::TraceUse would report 99/97 used modules
under NO_GETTEXT=[|Y], respectively. Now it's 52/37. It now takes ~15s
to run t9001-send-email.sh, down from ~20s.

Changing File::Spec::Functions::{catdir,catfile} to invoking class
methods on File::Spec itself is idiomatic. See [1] for a more
elaborate explanation, the resulting code behaves the same way, just
without the now-pointless function wrapper.

1. http://lore.kernel.org/git/8735u8mmj9.fsf@evledraar.gmail.com

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-28 18:38:07 +09:00
447ed29c0d send-email: get rid of indirect object syntax
Change indirect object syntax such as "new X ARGS" to
"X->new(ARGS)". This allows perl to see what "new" is at compile-time
without having loaded Term::ReadLine. This doesn't matter now, but
will in a subsequent commit when we start lazily loading it.

Let's do the same for the adjacent "FakeTerm" package for consistency,
even though we're not going to conditionally load it.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-28 18:38:07 +09:00
4adbf387bf send-email: use function syntax instead of barewords
Change calls like "__ 'foo'" to "__('foo')" so the Perl compiler
doesn't have to guess that "__" is a function. This makes the code
more readable.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-28 18:38:07 +09:00
fef381e6dc send-email: lazily shell out to "git var"
Optimize git-send-email by only shelling out to "git var" if we need
to. This is easily done by re-inventing our own small version of
perl's Memoize module.

I suppose I could just use Memoize itself, but in a subsequent patch
I'll be micro-optimizing send-email's use of dependencies. Using
Memoize is a measly extra 5-10 milliseconds, but as we'll see that'll
end up mattering for us in the end.

This brings the runtime of a plain "send-email" from around ~160-170ms
to ~140m-150ms. The runtime of the tests is around the same, or around
~20s.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-28 18:38:07 +09:00
9264d29bf6 send-email: lazily load config for a big speedup
Reduce the time it takes git-send-email to get to even the most
trivial of tasks (such as serving up its "-h" output) by first listing
config keys that exist, and only then only call e.g. "git config
--bool" on them if they do.

Over a lot of runs this speeds the time to "-h" up for me from ~250ms
to ~150ms, and the runtime of t9001-send-email.sh goes from ~25s to
~20s.

This introduces a race condition where we'll do the "wrong" thing if a
config key were to be inserted between us discovering the list and
calling read_config(), i.e. we won't know about the racily added
key. In theory this is a change in behavior, in practice it doesn't
matter.

The config_regexp() function being changed here was added in
dd84e528a3 (git-send-email: die if sendmail.* config is set,
2020-07-23) for use by git-send-email. So we can change its odd return
value in the case where no values are found by "git config". The
difference in the *.pm code would matter if it was invoked in scalar
context, but now it no longer is.

Arguably this caching belongs in Git.pm itself, but in lieu of
modifying it for all its callers let's only do this for "git
send-email". The other big potential win would be "git svn", but
unlike "git send-email" it doesn't check tens of config variables one
at a time at startup (in my brief testing it doesn't check any).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-28 18:38:07 +09:00
2b110e9ba2 send-email: copy "config_regxp" into git-send-email.perl
The config_regexp() function was added in dd84e528a3 (git-send-email:
die if sendmail.* config is set, 2020-07-23) for use in
git-send-email, and it's the only in-tree user of it.

However, the consensus is that Git.pm is a public interface, so even
though it's a recently added function we can't change it. So let's
copy over a minimal version of it to git-send-email.perl itself. In a
subsequent commit it'll be changed further for our own use.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-28 18:38:07 +09:00
119974e9e7 send-email: refactor sendemail.smtpencryption config parsing
With the removal of the support for sendemail.smtpssl in the preceding
commit the parsing of sendemail.smtpencryption is no longer special,
and can by moved to %config_settings.

This gets us rid of an unconditional call to Git::config(), which as
we'll see in subsequent commits matters for startup performance.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-28 18:38:07 +09:00
671818ab0b send-email: remove non-working support for "sendemail.smtpssl"
Remove the already dead code to support "sendemail.smtpssl" by finally
removing the dead code supporting the configuration option.

In f6bebd121a (git-send-email: add support for TLS via
Net::SMTP::SSL, 2008-06-25) the --smtp-ssl command-line option was
documented as deprecated, later in 65180c6618 (List send-email config
options in config.txt., 2009-07-22) the "sendemail.smtpssl"
configuration option was also documented as such.

Then in in 3ff15040e2 (send-email: fix regression in
sendemail.identity parsing, 2019-05-17) I unintentionally removed
support for it by introducing a bug in read_config().

As can be seen from the diff context we've already returned unless
$enc i defined, so it's not possible for us to reach the "elsif"
branch here. This code was therefore already dead since Git v2.23.0.

So let's just remove it. We were already 11 years into a stated
deprecation period of this variable when 3ff15040e2 landed, now it's
around 13. Since it hasn't worked anyway for around 2 years it looks
like we can safely remove it.

The --smtp-ssl option is still deprecated, if someone cares they can
follow-up and remove that too, but unlike the config option that one
could still be in use in the wild. I'm just removing this code that's
provably unused already.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-28 18:38:07 +09:00
879be4319f send-email tests: test for boolean variables without a value
The Git.pm code does its own Perl-ifying of boolean variables, let's
ensure that empty values = true for boolean variables, as in the C
code.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-28 18:38:07 +09:00
ecc4ee9cac send-email tests: support GIT_TEST_PERL_FATAL_WARNINGS=true
Add support for the "GIT_TEST_PERL_FATAL_WARNINGS=true" test mode to
"send-email". This was added to e.g. git-svn in 5338ed2b26 (perl:
check for perl warnings while running tests, 2020-10-21), but not
"send-email". Let's rectify that.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-28 18:38:07 +09:00
4e42405f00 Git 2.32-rc2
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-28 13:05:29 +09:00
329d63e7be Merge branch 'en/dir-traversal'
Fix-up to a topic that is already in 'master'.

* en/dir-traversal:
  dir: introduce readdir_skip_dot_and_dotdot() helper
  dir: update stale description of treat_directory()
  Revert "dir: update stale description of treat_directory()"
  Revert "dir: introduce readdir_skip_dot_and_dotdot() helper"
2021-05-28 13:03:00 +09:00
906fc557b7 dir: introduce readdir_skip_dot_and_dotdot() helper
Many places in the code were doing
    while ((d = readdir(dir)) != NULL) {
        if (is_dot_or_dotdot(d->d_name))
            continue;
        ...process d...
    }
Introduce a readdir_skip_dot_and_dotdot() helper to make that a one-liner:
    while ((d = readdir_skip_dot_and_dotdot(dir)) != NULL) {
        ...process d...
    }

This helper particularly simplifies checks for empty directories.

Also use this helper in read_cached_dir() so that our statistics are
consistent across platforms.  (In other words, read_cached_dir() should
have been using is_dot_or_dotdot() and skipping such entries, but did
not and left it to treat_path() to detect and mark such entries as
path_none.)

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-27 14:02:37 +09:00
eef814828f dir: update stale description of treat_directory()
The documentation comment for treat_directory() was originally written
in 095952 (Teach directory traversal about subprojects, 2007-04-11)
which was before the 'struct dir_struct' split its bitfield of named
options into a 'flags' enum in 7c4c97c0 (Turn the flags in struct
dir_struct into a single variable, 2009-02-16). When those flags
changed, the comment became stale, since members like
'show_other_directories' transitioned into flags like
DIR_SHOW_OTHER_DIRECTORIES.

Update the comments for treat_directory() to use these flag names rather
than the old member names.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-27 14:02:37 +09:00
2c9f1bfdb4 Revert "dir: update stale description of treat_directory()"
This reverts commit 4e689d8171,
to be replaced with a reworked version.
2021-05-27 14:02:37 +09:00
1df046bcff Revert "dir: introduce readdir_skip_dot_and_dotdot() helper"
This reverts commit b548f0f156,
to be replaced with a reworked version.
2021-05-27 14:02:37 +09:00
5afd72a96f Merge branch 'ab/pack-linkage-fix'
"ld" on Solaris fails to link some test helpers, which has been
worked around by reshuffling the inline function definitions from a
header file to a source file that is the only user of them.

* ab/pack-linkage-fix:
  pack-objects: move static inline from a header to the sole consumer
2021-05-27 12:36:58 +09:00
2f0ca41349 Merge branch 'mt/t2080-cp-symlink-fix'
Test portability fix.

* mt/t2080-cp-symlink-fix:
  t2080: fix cp invocation to copy symlinks instead of following them
2021-05-27 12:36:57 +09:00
f4d715b0ac Merge branch 'ab/send-email-inline-hooks-path'
Code simplification.

* ab/send-email-inline-hooks-path:
  send-email: move "hooks_path" invocation to git-send-email.perl
  send-email: don't needlessly abs_path() the core.hooksPath
2021-05-27 12:36:57 +09:00
1accb34ce0 Merge branch 'ds/t1092-fix-flake-from-progress'
Workaround flaky tests introduced recently.

* ds/t1092-fix-flake-from-progress:
  t1092: revert the "-1" hack for emulating "no progress meter"
  t1092: use GIT_PROGRESS_DELAY for consistent results
2021-05-27 12:36:57 +09:00
7d089fb9b7 pack-objects: move static inline from a header to the sole consumer
Move the code that is only used in builtin/pack-objects.c out of
pack-objects.h.

This fixes an issue where Solaris's SunCC hasn't been able to compile
git since 483fa7f42d (t/helper/test-bitmap.c: initial commit,
2021-03-31).

The real origin of that issue is that in 898eba5e63 (pack-objects:
refer to delta objects by index instead of pointer, 2018-04-14)
utility functions only needed by builtin/pack-objects.c were added to
pack-objects.h. Since then the header has been used in a few other
places, but 483fa7f42d was the first time it was used by test helper.

Since Solaris is stricter about linking and the oe_get_size_slow()
function lives in builtin/pack-objects.c the build started failing
with:

    Undefined                       first referenced
     symbol                             in file
    oe_get_size_slow                    t/helper/test-bitmap.o
    ld: fatal: symbol referencing errors. No output written to t/helper/test-tool

On other platforms this is presumably OK because the compiler and/or
linker detects that the "static inline" functions that reference
oe_get_size_slow() aren't used.

Let's solve this by moving the relevant code from pack-objects.h to
builtin/pack-objects.c. This is almost entirely a code-only move, but
because of the early macro definitions in that file referencing some
of these inline functions we need to move the definition of "static
struct packing_data to_pack" earlier, and declare these inline
functions above the macros.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-27 12:14:41 +09:00
c2529290f0 Merge branch 'fr_next' of github.com:jnavila/git
* 'fr_next' of github.com:jnavila/git:
  l10n: fr: v2.32.0 round 1
  l10n: fr: fixed inconsistencies
  l10n: fr.po fixed inconsistencies
2021-05-27 10:28:50 +08:00
ea08db7473 t2080: fix cp invocation to copy symlinks instead of following them
t2080 makes a few copies of a test repository and later performs a
branch switch on each one of the copies to verify that parallel checkout
and sequential checkout produce the same results. However, the
repository is copied with `cp -R` which, on some systems, defaults to
following symlinks on the directory hierarchy and copying their target
files instead of copying the symlinks themselves. AIX is one example of
system where this happens. Because the symlinks are not preserved, the
copied repositories have paths that do not match what is in the index,
causing git to abort the checkout operation that we want to test. This
makes the test fail on these systems.

Fix this by copying the repository with the POSIX flag '-P', which
forces cp to copy the symlinks instead of following them. Note that we
already use this flag for other cp invocations in our test suite (see
t7001). With this change, t2080 now passes on AIX.

Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-27 09:04:49 +09:00
7cbc0455cc send-email: move "hooks_path" invocation to git-send-email.perl
Move the newly added "hooks_path" API in Git.pm to its only user in
git-send-email.perl. This was added in c8243933c7 (git-send-email:
Respect core.hooksPath setting, 2021-03-23), meaning that it hasn't
yet made it into a non-rc release of git.

The consensus with Git.pm is that we need to be considerate of
out-of-tree users who treat it as a public documented interface. We
should therefore be less willing to add new functionality to it, least
we be stuck supporting it after our own uses for it disappear.

In this case the git-send-email.perl hook invocation will probably be
replaced by a future "git hook run" command, and in the commit
preceding this one the "hooks_path" become nothing but a trivial
wrapper for "rev-parse --git-path hooks" anyway (with no
Cwd::abs_path() call), so let's just inline this command in
git-send-email.perl itself.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-27 09:00:59 +09:00
2815326f09 send-email: don't needlessly abs_path() the core.hooksPath
In c8243933c7 (git-send-email: Respect core.hooksPath setting,
2021-03-23) we started supporting core.hooksPath in "send-email". It's
been reported that on Windows[1] doing this by calling abs_path()
results in different canonicalizations of the absolute path.

This wasn't an issue in c8243933c7 itself, but was revealed by my
ea7811b37e (git-send-email: improve --validate error output,
2021-04-06) when we started emitting the path to the hook, which was
previously only internal to git-send-email.perl.

The just-landed 53753a37d0 (t9001-send-email.sh: fix expected
absolute paths on Windows, 2021-05-24) narrowly fixed this issue, but
I believe we can do better here. We should not be relying on whatever
changes Perl's abs_path() makes to the path "rev-parse --git-path
hooks" hands to us. Let's instead trust it, and hand it to Perl's
system() in git-send-email.perl. It will handle either a relative or
absolute path.

So let's revert most of 53753a37d0 and just have "hooks_path" return
what we get from "rev-parse" directly without modification. This has
the added benefit of making the error message friendlier in the common
case, we'll no longer print an absolute path for repository-local hook
errors.

1. http://lore.kernel.org/git/bb30fe2b-cd75-4782-24a6-08bb002a0367@kdbg.org

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-27 09:00:57 +09:00
a96355d84c t1092: revert the "-1" hack for emulating "no progress meter"
This looked like a good idea, but it seems to break tests on 32-bit
builds rather badly.  Revert to just use "100 thousands must be big
enough" for now.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-26 06:23:58 +09:00
1df318be63 l10n: id: po-id for 2.32.0 (round 1)
Translate following components:

  * builtin/add.c
  * worktree.c
  * builtin/branch.c
  * builtin/commit.c
  * builtin/merge.c
  * builtin/rebase.c
  * builtin/pull.c
  * diff.c
  * add-interactive.c
  * builtin/log.c
  * builtin/stash.c
  * builtin/tag.c
  * config.c
  * builtin/config.c
  * reset.c
  * builtin/remote.c
  * builtin/rm.c
  * builtin/mv.c
  * builtin/clean.c
  * builtin/help.c
  * archive.c
  * submodule.c
  * builtin/submodule--helper.c
  * submodule-config.c

Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
2021-05-25 18:08:16 +07:00
5d5b147345 Merge branch 'mt/init-template-userpath-fix'
Regression fix.

* mt/init-template-userpath-fix:
  init: fix bug regarding ~/ expansion in init.templateDir
2021-05-25 16:21:20 +09:00
d9929cbb08 Merge branch 'jt/send-email-validate-errors-fix'
Fix a test breakage.

* jt/send-email-validate-errors-fix:
  t9001-send-email.sh: fix expected absolute paths on Windows
2021-05-25 16:21:19 +09:00
53cb2103ce Merge branch 'ab/send-email-validate-errors-fix'
* ab/send-email-validate-errors-fix:
  send-email: fix missing error message regression
2021-05-25 16:21:19 +09:00
e2b05746e1 t1092: use GIT_PROGRESS_DELAY for consistent results
The t1092-sparse-checkout-compatibility.sh tests compare the stdout and
stderr for several Git commands across both full checkouts, sparse
checkouts with a full index, and sparse checkouts with a sparse index.
Since these are direct comparisons, sometimes a progress indicator can
flush at unpredictable points, especially on slower machines. This
causes the tests to be flaky.

One standard way to avoid this is to add GIT_PROGRESS_DELAY=0 to the Git
commands that are run, as this will force every progress indicator
created with start_progress_delay() to be created immediately. However,
there are some progress indicators that are created in the case of a
full index that are not created with a sparse index. Moreover, their
values may be different as those indexes have a different number of
entries.

Instead, use GIT_PROGRESS_DELAY=-1 (which will turn into UINT_MAX)
to ensure that any reasonable machine running these tests would
never display delayed progress indicators.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-25 15:30:33 +09:00
a185dd58ec init: fix bug regarding ~/ expansion in init.templateDir
We used to read the init.templateDir setting at builtin/init-db.c using
a git_config() callback that, in turn, called git_config_pathname(). To
simplify the config reading logic at this file and plug a memory leak,
this was replaced by a direct call to git_config_get_value() at
e4de4502e6 ("init: remove git_init_db_config() while fixing leaks",
2021-03-14). However, this function doesn't provide path expanding
semantics, like git_config_pathname() does, so paths with '~/' and
'~user/' are treated literally. This makes 'git init' fail to handle
init.templateDir paths using these constructs:

	$ git config init.templateDir '~/templates_dir'
	$ git init
	'warning: templates not found in ~/templates_dir'

Replace the git_config_get_value() call by git_config_get_pathname(),
which does the '~/' and '~user/' expansions. Also add a regression test.
Note that unlike git_config_get_value(), the config cache does not own
the memory for the path returned by git_config_get_pathname(), so we
must free() it.

Reported on IRC by rkta.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-25 13:22:08 +09:00
5b719b7552 send-email: fix missing error message regression
Fix a regression with the "the editor exited uncleanly, aborting
everything" error message going missing after my
d21616c039 (git-send-email: refactor duplicate $? checks into a
function, 2021-04-06).

I introduced a $msg variable, but did not actually use it. This caused
us to miss the optional error message supplied by the "do_edit"
codepath. Fix that, and add tests to check that this works.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-25 09:52:42 +09:00
53753a37d0 t9001-send-email.sh: fix expected absolute paths on Windows
Git for Windows is a native Windows program that works with native
absolute paths in the drive letter style C:\dir. The auxiliary
infrastructure is based on MSYS2, which uses POSIX style /C/dir.

When we test for output of absolute paths produced by git.exe, we
usally have to expect C:\dir style paths. To produce such expected
paths, we have to use $(pwd) in the test scripts; the alternative,
$PWD, produces a POSIX style path. ($PWD is a shell variable, and the
shell is bash, an MSYS2 program, and operates in the POSIX realm.)

There are two recently added tests that were written to expect C:\dir
paths. The output that is tested is produced by `git send-email`, but
behind the scenes, this is a Perl script, which also works in the
POSIX realm and produces /C/dir style output.

In the first test case that is changed here, replace $(pwd) by $PWD
so that the expected path is constructed using /C/dir style.

The second test case sets core.hooksPath to an absolute path. Since
the test script talks to native git.exe, it is supposed to place a
C:/dir style path into the configuration; therefore, keep $(pwd).
When this configuration value is consumed by the Perl script, it is
transformed to /C/dir style by the MSYS2 layer and echoed back in
this form in the error message. Hence, do use $PWD for the expected
value.

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-25 09:45:17 +09:00
3127ff90ea packfile-uri.txt: fix blobPackfileUri description
Fix the 'uploadpack.blobPackfileUri' description in packfile-uri.txt
and the correct format also can be seen in t5702.

Signed-off-by: Teng Long <dyroneteng@gmail.com>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-25 09:31:06 +09:00
7ba3016729 doc: avoid using rm directly
That's what we have $(RM) for.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-24 18:08:22 +09:00
db10fc6c09 doc: simplify Makefile using .DELETE_ON_ERROR
Currently GNU make already removes files when catching an interruption
signal, however, in order to deal with other kinds of errors a
workaround is in place to store target output to a temporary file, and
only move it to its right place on success.

By enabling the built-in .DELETE_ON_ERROR we let make do this task, so
we don't have to.

This way the rules can be simplified a lot.

Suggested-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-24 18:08:22 +09:00
471e7b2cf6 doc: remove unnecessary rm instances
Commits 50cff52f1a (When generating manpages, delete outdated targets
first., 2007-08-02) and f9286765b2 (Documentation/Makefile: remove
cmd-list.made before redirecting to it., 2007-08-06) created these rm
instances for a very rare corner-case: building as root by mistake.

It's odd to have workarounds here, but nowhere else in the Makefile--
which already fails in this stuation, starting from
Documentation/technical/.

We gain nothing but complexity, so let's remove them.

Comments-by: Jeff King <peff@peff.net>
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-24 18:08:22 +09:00
56da21392b doc: improve asciidoc dependencies
asciidoc needs asciidoc.conf, asciidoctor asciidoctor-extensions.rb.

Neither needs the other.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-24 18:08:22 +09:00
12d078ed2b doc: refactor common asciidoc dependencies
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-24 18:08:21 +09:00
11998a0364 l10n: vi.po(5204t): Updated Vietnamese translation for v2.32.0
Signed-off-by: Tran Ngoc Quan <vnwildman@gmail.com>
2021-05-24 13:54:03 +07:00
beded618b2 l10n: zh_TW.po: localized
Signed-off-by: Yi-Jyun Pan <pan93412@gmail.com>
2021-05-23 15:31:29 +08:00
de88ac70f3 Git 2.32-rc1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-22 18:29:01 +09:00
378c7c6ad4 Merge branch 'dl/stash-show-untracked-fixup'
Another brown paper bag inconsistency fix for a new feature
introduced during this cycle.

* dl/stash-show-untracked-fixup:
  stash show: use stash.showIncludeUntracked even when diff options given
2021-05-22 18:29:01 +09:00
6aae0e2ad2 Merge branch 'jh/simple-ipc-sans-pthread'
The "simple-ipc" did not compile without pthreads support, but the
build procedure was not properly account for it.

* jh/simple-ipc-sans-pthread:
  simple-ipc: correct ifdefs when NO_PTHREADS is defined
2021-05-22 18:29:01 +09:00
99fe1c6069 Merge branch 'wm/rev-parse-path-format-wo-arg'
The "rev-parse" command did not diagnose the lack of argument to
"--path-format" option, which was introduced in v2.31 era, which
has been corrected.

* wm/rev-parse-path-format-wo-arg:
  rev-parse: fix segfault with missing --path-format argument
2021-05-22 18:29:00 +09:00
5317dfeaed t: use configured TAR instead of tar
Despite that tar is available everywhere, it's not required by POSIX.

In our build system, users are allowed to specify which tar to be used
in Makefile knobs. Furthermore, GNU tar (gtar) is prefered when autotools
is being used.

In our testsuite, 7 out of 9 tar-required-tests use "$TAR", the other
two use "tar".

Let's change the remaining two tests to "$TAR".

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-22 18:01:57 +09:00
af5cd44b6f stash show: use stash.showIncludeUntracked even when diff options given
If options pertaining to how the diff is displayed is provided to
`git stash show`, the command will ignore the stash.showIncludeUntracked
configuration variable, defaulting to not showing any untracked files.
This is unintuitive behaviour since the format of the diff output and
whether or not to display untracked files are orthogonal.

Use stash.showIncludeUntracked even when diff options are given. Of
course, this is still overridable via the command-line options.

Update the documentation to explicitly say which configuration variables
will be overridden when a diff options are given.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-22 17:56:46 +09:00
a6eff43bea l10n: zh_TW.po: v2.32.0 round 1 (11 untranslated)
Signed-off-by: Yi-Jyun Pan <pan93412@gmail.com>
2021-05-21 14:50:11 +08:00
f5bfcc823b diff-merges: let "-m" imply "-p"
Fix long standing inconsistency between -c/--cc that do imply -p on
one side, and -m that did not imply -p on the other side.

Change corresponding test accordingly, as "log -m" output should now
match one from "log -m -p", rather than from just "log".

Change documentation accordingly.

NOTES:

After this patch

  git log -m

produces diffs without need to provide -p as well, that improves both
consistency and usability. It gets even more useful if one sets
"log.diffMerges" configuration variable to "first-parent" to force -m
produce usual diff with respect to first parent only.

This patch, however, does not change behavior when specific diff
format is explicitly provided on the command-line, so that commands
like

  git log -m --raw
  git log -m --stat

are not affected, nor does it change commands where specific diff
format is active by default, such as:

  git diff-tree -m

It's also worth to be noticed that exact historical semantics of -m is
still provided by --diff-merges=separate.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-21 09:24:14 +09:00
fd16a39944 diff-merges: rename "combined_imply_patch" to "merges_imply_patch"
This is refactoring change in preparation for the next commit that
will let -m imply -p.

The old name doesn't match the intention to let not only -c/-cc imply
-p, but also -m, that is not a "combined" format, so we rename the
flag accordingly.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-21 09:24:14 +09:00
1e20a407fe stash list: stop passing "-m" to "git log"
Passing "-m" in "git log --first-parent -m" is not needed as
--first-parent implies --diff-merges=first-parent anyway. OTOH, it
will stop being harmless once we let "-m" imply "-p".

While we are at it, fix corresponding test description in t3903-stash
to match what it actually tests.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-21 09:24:14 +09:00
23f6d40dd3 git-svn: stop passing "-m" to "git rev-list"
rev-list doesn't utilize -m. It happens to eat it silently, so this
bug went unnoticed.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-21 09:24:14 +09:00
19b2517f95 diff-merges: move specific diff-index "-m" handling to diff-index
Move specific handling of "-m" for diff-index to diff-index.c, so
diff-merges is left to handle only diff for merges options.

Being a better design by itself, this is especially essential in
preparation for letting -m imply -p, as "diff-index -m" obviously
should not imply -p, as it's entirely unrelated.

To handle this, in addition to moving specific diff-index "-m" code
out of diff-merges, we introduce new

  diff_merges_suppress_options_parsing()

and call it before generic options processing in cmd_diff_index().

This new diff_merges_suppress_options_parsing() could then be reused
and called before invocations of setup_revisions() for other commands
that don't need --diff-merges options, but that's outside of the scope
of these patch series.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-21 09:24:14 +09:00
e0b16421b1 t4013: test "git diff-index -m"
-m in "git diff-index" means "match missing", that differs
from its meaning in "git diff". Let's check it in diff-index.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-21 09:24:14 +09:00
3ae7fe2b0f t4013: test "git diff-tree -m"
We want to ensure we don't affect plumbing commands with our changes
of "-m" semantics, so add corresponding test.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-21 09:24:13 +09:00
faf16d4e97 t4013: test "git log -m --stat"
This is to ensure we won't break different diff formats when we start
to imply "-p" by "-m".

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-21 09:24:13 +09:00
48229c193d t4013: test "git log -m --raw"
This is to ensure we won't break different diff formats when we start
to imply "-p" by "-m".

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-21 09:24:13 +09:00
7a55fa0e0b t4013: test that "-m" alone has no effect in "git log"
This is to notice current behavior that we are going to change when we
start to imply "-p" by "-m".

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-21 09:24:13 +09:00
6aac70a870 simple-ipc: correct ifdefs when NO_PTHREADS is defined
Simple IPC always requires threads (in addition to various
platform-specific IPC support).  Fix the ifdefs in the Makefile
to define SUPPORTS_SIMPLE_IPC when appropriate.

Previously, the Unix version of the code would only verify that
Unix domain sockets were available.

This problem was reported here:
https://lore.kernel.org/git/YKN5lXs4AoK%2FJFTO@coredump.intra.peff.net/T/#m08be8f1942ea8a2c36cfee0e51cdf06489fdeafc

Reported-by: Randall S. Becker <rsbecker@nexbridge.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-21 07:55:00 +09:00
225f7fa847 help: fix small typo in error message
Classic string concatenation while forgetting a space character.

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-21 07:52:10 +09:00
4e0a64a713 trace2: refactor to avoid gcc warning under -O3
Refactor tr2_dst_try_uds_connect() to avoid a gcc warning[1] that
appears under -O3 (but not -O2). This makes the build pass under
DEVELOPER=1 without needing a DEVOPTS=no-error.

This can be reproduced with GCC Debian 8.3.0-6, but not e.g. with
clang 7.0.1-8+deb10u2. We've had this warning since
ee4512ed48 (trace2: create new combined trace facility, 2019-02-22).

As noted in [2] this warning happens because the compiler doesn't
assume that errno must be non-zero after a failed syscall.

Let's work around by using the well-established "saved_errno" pattern,
along with returning -1 ourselves instead of "errno". The caller can
thus rely on our "errno" on failure.

See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61846 for a related
bug report against GCC.

1.

    trace2/tr2_dst.c: In function ‘tr2_dst_get_trace_fd.part.5’:
    trace2/tr2_dst.c:296:10: warning: ‘fd’ may be used uninitialized in this function [-Wmaybe-uninitialized]
      dst->fd = fd;
      ~~~~~~~~^~~~
    trace2/tr2_dst.c:229:6: note: ‘fd’ was declared here
      int fd;
          ^~
2. https://lore.kernel.org/git/20200404142131.GA679473@coredump.intra.peff.net/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-21 06:56:11 +09:00
107691cb07 Merge branch 'ds/sparse-index-protections'
Fix access to uninitialized piece of memory, introduced during this
cycle.

* ds/sparse-index-protections:
  sparse-index: fix uninitialized jump
2021-05-21 05:50:38 +09:00
2b8b1aa6ad Merge branch 'tz/c-locale-output-is-no-more'
Test update.

* tz/c-locale-output-is-no-more:
  t7500: remove non-existant C_LOCALE_OUTPUT prereq
2021-05-21 05:50:32 +09:00
c69f2f8c86 Merge branch 'cs/http-use-basic-after-failed-negotiate'
Regression fix for a change made during this cycle.

* cs/http-use-basic-after-failed-negotiate:
  Revert "remote-curl: fall back to basic auth if Negotiate fails"
  t5551: test http interaction with credential helpers
2021-05-21 05:49:41 +09:00
733b9f59ba l10n: sv.po: Update Swedish translation (5204t0f0u)
Signed-off-by: Peter Krefting <peter@softwolves.pp.se>
2021-05-20 13:32:48 +01:00
25e65b6dd5 merge-ort, diffcore-rename: employ cached renames when possible
When there are many renames between the old base of a series of commits
and the new base, the way sequencer.c, merge-recursive.c, and
diffcore-rename.c have traditionally split the work resulted in
redetecting the same renames with each and every commit being
transplanted.  To address this, the last several commits have been
creating a cache of rename detection results, determining when it was
safe to use such a cache in subsequent merge operations, adding helper
functions, and so on.  See the previous half dozen commit messages for
additional discussion of this optimization, particularly the message a
few commits ago entitled "add code to check for whether cached renames
can be reused".  This commit finally ties all of that work together,
modifying the merge algorithm to make use of these cached renames.

For the testcases mentioned in commit 557ac0350d ("merge-ort: begin
performance work; instrument with trace2_region_* calls", 2020-10-28),
this change improves the performance as follows:

                            Before                  After
    no-renames:        5.665 s ±  0.129 s     5.622 s ±  0.059 s
    mega-renames:     11.435 s ±  0.158 s    10.127 s ±  0.073 s
    just-one-mega:   494.2  ms ±  6.1  ms   500.3  ms ±  3.8  ms

That's a fairly small improvement, but mostly because the previous
optimizations were so effective for these particular testcases; this
optimization only kicks in when the others don't.  If we undid the
basename-guided rename detection and skip-irrelevant-renames
optimizations, then we'd see that this series by itself improved
performance as follows:

                   Before Basename Series   After Just This Series
    no-renames:      13.815 s ±  0.062 s      5.697 s ±  0.080 s
    mega-renames:  1799.937 s ±  0.493 s    205.709 s ±  0.457 s

Since this optimization kicks in to help accelerate cases where the
previous optimizations do not apply, this last comparison shows that
this cached-renames optimization has the potential to help signficantly
in cases that don't meet the requirements for the other optimizations to
be effective.

The changes made in this optimization also lay some important groundwork
for a future optimization around having collect_merge_info() avoid
recursing into subtrees in more cases.

However, for this optimization to be effective, merge_switch_to_result()
should only be called when the rebase or cherry-pick operation has
either completed or hit a case where the user needs to resolve a
conflict or edit the result.  If it is called after every commit, as
sequencer.c does, then the working tree and index are needlessly updated
with every commit and the cached metadata is tossed, defeating this
optimization.  Refactoring sequencer.c to only call
merge_switch_to_result() at the end of the operation is a bigger
undertaking, and the practical benefits of this optimization will not be
realized until that work is performed.  Since `test-tool fast-rebase`
only updates at the end of the operation, it was used to obtain the
timings above.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-20 15:40:39 +09:00
cbdca289fb merge-ort: handle interactions of caching and rename/rename(1to1) cases
As documented in Documentation/technical/remembering-renames.txt, and as
tested for in the two testcases in t6429 with "rename same file
identically" in their description, there is one case where we need to
have renames in one commit NOT be cached for the next commit in our
rebase sequence -- namely, rename/rename(1to1) cases.  Rather than
specifically trying to uncache those and fix up dir_rename_counts() to
match (which would also be valid but more work), we simply disable the
optimization when this really rare type of rename occurs.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-20 15:40:39 +09:00
86b41b3895 merge-ort: add helper functions for using cached renames
If we have a usable rename cache, then we can remove from
relevant_sources all the paths that were cached;
diffcore_rename_extended() can then consider an even smaller set of
relevant_sources in its rename detection.

However, when diffcore_rename_extended() is done, we will need to take
the renames it detected and then add back in all the ones we had cached
from before.

Add helper functions for doing these two operations; the next commit
will make use of them.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-20 15:40:39 +09:00
d509802993 merge-ort: preserve cached renames for the appropriate side
Previous commits created an in-memory cache of the results of rename
detection, and added logic to detect when that cache could appropriately
be used in a subsequent merge operation -- but we were still
unconditionally clearing the cache with each new merge operation anyway.
If it is valid to reuse the cache from one of the two sides of history,
preserve that side.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-20 15:40:39 +09:00
19ceb486f8 merge-ort: avoid accidental API mis-use
Previously, callers of the merge-ort API could have passed an
uninitialized value for struct merge_result *result.  However, we want
to check result to see if it has cached renames from a previous merge
that we can reuse; such values would be found behind result->priv.
However, if result->priv is uninitialized, attempting to access behind
it will give a segfault.  So, we need result->priv to be NULL (which
will be the case if the caller does a memset(&result, 0)), or be written
by a previous call to the merge-ort machinery.  Documenting this
requirement may help, but despite being the person who introduced this
requirement, I still missed it once and it did not fail in a very clear
way and led to a long debugging session.

Add a _properly_initialized field to merge_result; that value will be
0 if the caller zero'ed the merge_result, it will be set to a very
specific value by a previous run by the merge-ort machinery, and if it's
uninitialized it will most likely either be 0 or some value that does
not match the specific one we'd expect allowing us to throw a much more
meaningful error.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-20 15:40:39 +09:00
64aceb6d73 merge-ort: add code to check for whether cached renames can be reused
We need to know when renames detected in a previous merge operation can
be reused in a later merge operation.  Consider the following setup
(from the git-rebase manpage):

                     A---B---C topic
                    /
               D---E---F---G master

After rebasing, this will appear as:

                             A'--B'--C' topic
                            /
               D---E---F---G master

Further, let's say that 'oldfile' was renamed to 'newfile' between E
and G.  The rebase or cherry-pick of A onto G will involve a three-way
merge between E (as the merge base) and G and A.  After detecting the
rename between E:oldfile and G:newfile, there will be a three-way
content merge of the following:
    E:oldfile
    G:newfile
    A:oldfile
and produce a new result:
    A':newfile

Now, when we want to pick B onto A', we will need to do a three-way
merge between A (as the merge-base) and A' and B.  This will involve
a three-way content merge of
    A:oldfile
    A':newfile
    B:oldfile
but only if we can detect that A:oldfile is similar enough to A':newfile
to be used together in a three-way content merge, i.e. only if we can
detect that A:oldfile and A':newfile are a rename.  But we already know
that A:oldfile and A':newfile are similar enough to be used in a
three-way content merge, because that is precisely where A':newfile came
from in the previous merge.

Note that A & A' both appear in both merges.  That gives us the
condition under which we can reuse renames.

There are a couple important points about this optimization:

  - If the rebase or cherry-pick halts for user conflicts, these caches
    are NOT saved anywhere.  Thus, resuming a halted rebase or
    cherry-pick will result in no reused renames for the next commit.
    This is intentional, as user resolution can change files
    significantly and in ways that violate the similarity assumptions
    here.

  - Technically, in a *very* narrow case this might give slightly
    different results for rename detection.  Using the example above,
    if:
      * E:oldfile had 20 lines
      * G:newfile added 10 new lines at the beginning of the file
      * A:oldfile deleted all but the first three lines of the file
    then
      => A':newfile would have 13 lines, 3 of which matches those
         in A:oldfile.

    Consider the two cases:
      * Without this optimization:
        - the next step of the rebase operation (moving B to B')
          would not detect the rename betwen A:oldfile and A':newfile
        - we'd thus get a modify/delete conflict with the rebase
          operation halting for the user to resolve, and have both
          A':newfile and B:oldfile sitting in the working tree.
      * With this optimization:
        - the rename between A:oldfile and A':newfile would be detected
          via the cache of renames
        - a three-way merge between A:oldfile, A':newfile, and B:oldfile
          would commence and be written to A':newfile

    Now, is the difference in behavior a bug...or a bugfix?  I can't
    tell.  Given that A:oldfile and A':newfile are not very similar,
    when we three-way merge with B:oldfile it seems likely we'll hit a
    conflict for the user to resolve.  And it shouldn't be too hard for
    users to see why we did that three-way merge; oldfile and newfile
    *were* renames somewhere in the sequence.  So, most of these corner
    cases will still behave similarly -- namely, a conflict given to the
    user to resolve.  Also, consider the interesting case when commit B
    is a clean revert of commit A.  Without this optimization, a rebase
    could not both apply a weird patch like A and then immediately
    revert it; users would be forced to resolve merge conflicts.  With
    this optimization, it would successfully apply the clean revert.
    So, there is certainly at least one case that behaves better.  Even
    if it's considered a "difference in behavior", I think both behaviors
    are reasonable, and the time savings provided by this optimization
    justify using the slightly altered rename heuristics.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-20 15:40:39 +09:00
2734f2e324 merge-ort: populate caches of rename detection results
Fill in cache_pairs, cached_target_names, and cached_irrelevant based on
rename detection results.  Future commits will make use of these values.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-20 15:40:39 +09:00
d29bd6d73d merge-ort: add data structures for in-memory caching of rename detection
When there are many renames between the old base of a series of commits
and the new base for a series of commits, the sequence of merges
employed to transplant those commits (from a cherry-pick or rebase
operation) will repeatedly detect the exact same renames.  This is
wasted effort.

Add data structures which will be used to cache rename detection
results, along with the initialization and deallocation of these data
structures.  Future commits will populate these caches, detect the
appropriate circumstances when they can be used, and employ them to
avoid re-detecting the same renames repeatedly.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-20 15:40:39 +09:00
a22099f552 t6429: testcases for remembering renames
We will soon be adding an optimization that caches (in memory only,
never written to disk) upstream renames during a sequence of merges such
as occurs during a cherry-pick or rebase operation.  Add several tests
meant to stress such an implementation to ensure it does the right
thing, and include a test whose outcome we will later change due to this
optimization as well.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-20 15:40:39 +09:00
f9500261e0 fast-rebase: write conflict state to working tree, index, and HEAD
Previously, when fast-rebase hit a conflict, it simply aborted and left
HEAD, the index, and the working tree where they were before the
operation started.  While fast-rebase does not support restarting from a
conflicted state, write the conflicted state out anyway as it gives us a
way to see what the conflicts are and write tests that check for them.

This will be important in the upcoming commits, because sequencer.c is
only superficially integrated with merge-ort.c; in particular, it calls
merge_switch_to_result() after EACH merge instead of only calling it at
the end of all the sequence of merges (or when a conflict is hit).  This
not only causes needless updates to the working copy and index, but also
causes all intermediate data to be freed and tossed, preventing caching
information from one merge to the next.  However, integrating
sequencer.c more deeply with merge-ort.c is a big task, and making this
small extension to fast-rebase.c provides us with a simple way to test
the edge and corner cases that we want to make sure continue working.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-20 15:40:39 +09:00
caba91c373 fast-rebase: change assert() to BUG()
assert() can succinctly document expectations for the code, and do so in
a way that may be useful to future folks trying to refactor the code and
change basic assumptions; it allows them to more quickly find some
places where their violations of previous assumptions trips things up.

Unfortunately, assert() can surround a function call with important
side-effects, which is a huge mistake since some users will compile with
assertions disabled.  I've had to debug such mistakes before in other
codebases, so I should know better.  Luckily, this was only in test
code, but it's still very embarrassing.  Change an assert() to an if
(...) BUG (...).

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-20 15:40:39 +09:00
bb80333c08 Documentation/technical: describe remembering renames optimization
Remembering renames on the upstream side of history in an early merge of
a rebase or cherry-pick for re-use in a latter merge of the same
operation makes pretty good intuitive sense.  However, trying to show
that it doesn't cause some subtle behavioral difference or some funny
edge or corner case is much more involved.  And, in fact, it does
introduce a subtle behavioral change.

Document all the assumptions, special cases, and logic involved in such
an optimization, and describe why this optimization is safe under the
current optimizations/features/etc. -- even when the subtle behavioral
change is triggered.

Part of the point of adding this document that goes over the
optimization in such laborious detail, is that it is possible that
significant future changes (optimizations or feature changes) could
interact with this optimization in interesting ways; this document is
here to help folks making big changes sanity check that the assumptions
and arguments underlying this optimization are still valid.  (As a side
note, creating this document forced me to review things in sufficient
detail that I found I was not properly caching directory-rename-induced
renames, resulting in the code not being aware of those renames and
causing unnecessary diffcore_rename_extended() calls in subsequent
merges.)

A subsequent commit will add several testcases based on this document
meant to stress-test the implementation and also document the case with
the subtle behavioral change.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-20 15:40:39 +09:00
a84216c684 doc: explain the use of color.pager
The current documentation for color.pager is technically correct, but
slightly misleading and doesn't really clarify the purpose of the
variable. As explained in the original thread which added it:

  https://lore.kernel.org/git/E1G6zPH-00062L-Je@moooo.ath.cx/

the point is to deal with pagers that don't understand colors. And hence it
being set to "true" is necessary for colorizing output to the pager, but
not sufficient by itself (you must also have enabled one of the other
color options, though note that these are set to "auto" by default these
days).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-20 15:37:10 +09:00
619418b993 l10n: fix typos in po/TEAMS
Find typos in "po/TEAMS" file using the "git-po-helper" program.  These
typos were introduced from commit v2.24.0-1-g9917eca794 (l10n: zh_TW:
add translation for v2.24.0, 2019-11-20 19:14:22 +0800).

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
2021-05-20 12:56:10 +08:00
8013d7d9ee setup: split "extensions found" messages into singular and plural
It's easier to translate this way.

Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-20 13:36:58 +09:00
09667e9516 fetch: improve grammar of "shallow roots" message
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-20 13:36:33 +09:00
88dd4282d9 A handful more topics before -rc1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-20 08:55:00 +09:00
cb227d5cd6 Merge branch 'jk/test-chainlint-softer'
The "chainlint" feature in the test framework is a handy way to
catch common mistakes in writing new tests, but tends to get
expensive.  An knob to selectively disable it has been introduced
to help running tests that the developer has not modified.

* jk/test-chainlint-softer:
  t: avoid sed-based chain-linting in some expensive cases
2021-05-20 08:55:00 +09:00
02112fcb70 Merge branch 'en/prompt-under-set-u'
The bash prompt script (in contrib/) did not work under "set -u".

* en/prompt-under-set-u:
  git-prompt: work under set -u
2021-05-20 08:55:00 +09:00
36a255acd1 Merge branch 'zh/ref-filter-push-remote-fix'
The handling of "%(push)" formatting element of "for-each-ref" and
friends was broken when the same codepath started handling
"%(push:<what>)", which has been corrected.

* zh/ref-filter-push-remote-fix:
  ref-filter: fix read invalid union member bug
2021-05-20 08:55:00 +09:00
bdff0419da Merge branch 'ew/sha256-clone-remote-curl-fix'
"git clone" from SHA256 repository by Git built with SHA-1 as the
default hash algorithm over the dumb HTTP protocol did not
correctly set up the resulting repository, which has been corrected.

* ew/sha256-clone-remote-curl-fix:
  remote-curl: fix clone on sha256 repos
2021-05-20 08:54:59 +09:00
33be431c0c Merge branch 'en/dir-traversal'
"git clean" and "git ls-files -i" had confusion around working on
or showing ignored paths inside an ignored directory, which has
been corrected.

* en/dir-traversal:
  dir: introduce readdir_skip_dot_and_dotdot() helper
  dir: update stale description of treat_directory()
  dir: traverse into untracked directories if they may have ignored subfiles
  dir: avoid unnecessary traversal into ignored directory
  t3001, t7300: add testcase showcasing missed directory traversal
  t7300: add testcase showing unnecessary traversal into ignored directory
  ls-files: error out on -i unless -o or -c are specified
  dir: report number of visited directories and paths with trace2
  dir: convert trace calls to trace2 equivalents
2021-05-20 08:54:59 +09:00
2e2ed74be0 Merge branch 'ab/perl-makefile-cleanup'
Build procedure clean-up.

* ab/perl-makefile-cleanup:
  Makefile: make PERL_DEFINES recursively expanded
  perl: use mock i18n functions under NO_GETTEXT=Y
  Makefile: regenerate *.pm on NO_PERL_CPAN_FALLBACKS change
  Makefile: regenerate perl/build/* if GIT-PERL-DEFINES changes
  Makefile: don't re-define PERL_DEFINES
2021-05-20 08:54:58 +09:00
617480d75b refs: make explicit that ref_iterator_peel returns boolean
Use -1 as error return value throughout.

This removes spurious differences in the GIT_TRACE_REFS output, depending on the
ref storage backend active.

Before, the cached ref_iterator (but only that iterator!) would return
peel_object() output directly. No callers relied on the peel_status values
beyond success/failure. All calls to these functions go through
peel_iterated_oid(), which returns peel_object() as a fallback, but also
squashing the error values.

The iteration interface already passes REF_ISSYMREF and REF_ISBROKEN through the
flags argument, so the additional error values in enum peel_status provide no
value.

The ref iteration interface provides a separate peel() function because certain
formats (eg. packed-refs and reftable) can store the peeled object next to the
tag SHA1. Passing the peeled SHA1 as an optional argument to each_ref_fn maps
more naturally to the implementation of ref databases. Changing the code in this
way is left for a future refactoring.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-20 07:54:12 +09:00
ae1a7eefff fetch-pack: signal v2 server that we are done making requests
When fetching with the v0 protocol over ssh (or a local upload-pack with
pipes), the server closes the connection as soon as it is finished
sending the pack. So even though the client may still be operating on
the data via index-pack (e.g., resolving deltas, checking connectivity,
etc), the server has released all resources.

With the v2 protocol, however, the server considers the ssh session only
as a transport, with individual requests coming over it. After sending
the pack, it goes back to its main loop, waiting for another request to
come from the client. As a result, the ssh session hangs around until
the client process ends, which may be much later (because resolving
deltas, etc, may consume a lot of CPU).

This is bad for two reasons:

  - it's consuming resources on the server to leave open a connection
    that won't see any more use

  - if something bad happens to the ssh connection in the meantime (say,
    it gets killed by the network because it's idle, as happened in a
    real-world report), then ssh will exit non-zero, and we'll propagate
    the error up the stack.

The server is correct here not to hang up after serving the pack. The v2
protocol's design is meant to allow multiple requests like this, and
hanging up would be the wrong thing for a hypothetical client which was
planning to make more requests (though in practice, the git.git client
never would, and I doubt any other implementations would either).

The right thing is instead for the client to signal to the server that
it's not interested in making more requests. We can do that by closing
the pipe descriptor we use to write to ssh. This will propagate to the
server upload-pack as an EOF when it tries to read the next request (and
then it will close its half, and the whole connection will go away).

It's important to do this "half duplex" shutdown, because we have to do
it _before_ we actually receive the pack. This is an artifact of the way
fetch-pack and index-pack (or unpack-objects) interact. We hand the
connection off to index-pack (really, a sideband demuxer which feeds
it), and then wait until it returns. And it doesn't do that until it has
resolved all of the deltas in the pack, even though it was done reading
from the server long before.

So just closing the connection fully after index-pack returns would be
too late; we'd have held it open much longer than was necessary. And
teaching index-pack to close the connection is awkward. It's not even
seeing the whole conversation (the sideband demuxer is, but it doesn't
actually know what's in the packets, or when the end comes).

Note that this close() is happening deep within the transport code. It's
possible that a caller would want to perform other operations over the
same ssh transport after receiving the pack. But as of the current code,
none of the callers do, and there haven't been discussions of any plans
to change this. If we need to support that later, we can probably do so
by passing down a flag for "you're the last request on the transport;
it's OK to close" instead of the code just assuming that's true.

The description above all discusses v2 ssh, so it's worth thinking about
how this interacts with other protocols:

  - in v0 protocols, we could do the same half-duplex shutdown (it just
    goes into the v0 do_fetch_pack() instead). This does work, but since
    it doesn't have the same persistence problem in the first place,
    there's little reason to change it at this point.

  - local fetches against git-upload-pack on the same machine will
    behave the same as ssh (they are talking over two pipes, and see EOF
    on their input pipe)

  - fetches against git-daemon will run this same code, and close one of
    the descriptors. In practice, this won't do anything, since there
    our two descriptors are dups of each other, and not part of a
    half-duplex pair. The right thing would probably be to call
    shutdown(SHUT_WR) on it. I didn't bother with that here. It doesn't
    face the same error-code problem (since it's just a TCP connection),
    so it's really only an optimization problem. And git:// is not that
    widely used these days, and has less impact on server resources than
    an ssh termination.

  - v2 http doesn't suffer from this problem in the first place, as our
    pipes terminate at a local git-remote-https, which is passing data
    along as individual requests via curl. Probably curl is keeping the
    TCP/TLS connection open for more requests, and we might be able to
    tell it manually "hey, we are done making requests now". But I think
    that's much less important. It again doesn't suffer from the
    error-code problem, and HTTP keepalive is pretty well understood
    (importantly, the timeouts can be set low, because clients like curl
    know how to reconnect for subsequent requests if necessary). So it's
    probably not worth figuring out how to tell curl that we're done
    (though if we do, this patch is probably the first step anyway;
    fetch-pack closes the pipe back to remote-https, which would be the
    signal that it should tell curl we're done).

The code is pretty straightforward. We close the pipe at the right
moment, and set it to -1 to mark it as invalid. I modified the later
cleanup code to avoid calling close(-1). That's not strictly necessary,
since close(-1) is a noop, but hopefully makes things a bit more obvious
to a reader.

I suspect that trying to call more transport functions after the close()
(e.g., calling transport_fetch_refs() again) would fail, as it's not
smart enough to realize we need to re-open the ssh connection. But
that's already true when v0 is in use. And no current callers want to do
that (and again, the solution is probably a flag in the transport code
to keep things open, which can be added later).

There's no test here, as the situation it covers is inherently racy (the
question is when upload-pack exits, compared to when index-pack finishes
resolving deltas and exits). The rather gross shell snippet below does
recreate the problematic situation; when run on a sufficiently-large
repository (git.git works fine), it kills an "idle" upload-pack while
the client is resolving deltas, leading to a failed clone.

    (
	    git clone --no-local --progress . foo.git 2>&1
	    echo >&2 "clone exit code=$?"
    ) |
    tr '\r' '\n' |
    while read line
    do
	    case "$done,$line" in
	    ,Resolving*)
		    echo "hit resolving deltas; killing upload-pack"
		    killall -9 git-upload-pack
		    done=t
		    ;;
	    esac
    done

Reported-by: Greg Pflaum <greg.pflaum@pnp-hcl.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-20 07:38:40 +09:00
4d0a2a608d l10n: fr: v2.32.0 round 1
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
2021-05-19 18:58:25 +02:00
6aacb7d861 clone: clean up directory after transport_fetch_refs() failure
git-clone started respecting errors from the transport subsystem in
aab179d937 (builtin/clone.c: don't ignore transport_fetch_refs() errors,
2020-12-03). However, that commit didn't handle the cleanup of the
filesystem quite right.

The cleanup of the directory that cmd_clone() creates is done by an
atexit() handler, which we control with a flag. It starts as
JUNK_LEAVE_NONE ("clean up everything"), then progresses to
JUNK_LEAVE_REPO when we know we have a valid repo but not working tree,
and then finally JUNK_LEAVE_ALL when we have a successful checkout.

Most errors cause us to die(), which then triggers the handler to do the
right thing based on how far into cmd_clone() we got. But the checks
added by aab179d937 instead set the "err" variable and then jump to a
new "cleanup" label, which then returns our non-zero status. However,
the code after the cleanup label includes setting the flag to
JUNK_LEAVE_ALL, and so we accidentally leave the repository and working
tree in place.

One obvious option to fix this is to reorder the end of the function to
set the flag first, before cleanup code, and put the label between them.

But we can observe another small bug: the error return from
transport_fetch_refs() is generally "-1", and we propagate that to the
return value of cmd_clone(), which ultimately becomes the exit code of
the process. And we try to avoid transmitting negative values via exit
codes (only the low 8 bits are passed along as an unsigned value, though
in practice for "-1" this at least retains the property that it's
non-zero).

Instead, let's just die(). That makes us consistent with rest of the
code in the function. It does add a new "fatal:" line to the output, but
I'd argue that's a good thing:

  - in the rare case that the transport code didn't say anything, now
    the user gets _some_ error message

  - even if the transport code said something like "error: ssh died of
    signal 9", it's nice to also say "fatal" to indicate that we
    considered that to be a show-stopper.

Triggering this in the test suite turns out to be surprisingly
difficult. Almost every error we'd encounter, including ones deep inside
the transport code, cause us to just die() right there! However, one way
is to put a fake wrapper around git-upload-pack that sends the complete
packfile but exits with a failure code.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-19 21:14:59 +09:00
e22f2daed0 docs: improve fast-forward in glossary content
The text was somewhat confusing between the revision itself and the author.

Signed-off-by: Reuven Yagel <robi@post.jce.ac.il>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-19 21:11:49 +09:00
f6e2cd0625 read-cache: delete unused hashing methods
These methods were marked as MAYBE_UNUSED in the previous change to
avoid a complicated diff. Delete them entirely, since we now use the
hashfile API instead of this custom hashing code.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-19 16:41:21 +09:00
410334ed52 read-cache: use hashfile instead of git_hash_ctx
The do_write_index() method in read-cache.c has its own hashing logic
and buffering mechanism. Specifically, the ce_write() method was
introduced by 4990aadc (Speed up index file writing by chunking it
nicely, 2005-04-20) and similar mechanisms were introduced a few months
later in c38138cd (git-pack-objects: write the pack files with a SHA1
csum, 2005-06-26). Based on the timing, in the early days of the Git
codebase, I figured that these roughly equivalent code paths were never
unified only because it got lost in the shuffle. The hashfile API has
since been used extensively in other file formats, such as pack-indexes,
multi-pack-indexes, and commit-graphs. Therefore, it seems prudent to
unify the index writing code to use the same mechanism.

I discovered this disparity while trying to create a new index format
that uses the chunk-format API. That API uses a hashfile as its base, so
it is incompatible with the custom code in read-cache.c.

This rewrite is rather straightforward. It replaces all writes to the
temporary file with writes to the hashfile struct. This takes care of
many of the direct interactions with the_hash_algo.

There are still some git_hash_ctx uses remaining: the extension headers
are hashed for use in the End of Index Entries (EOIE) extension. This
use of the git_hash_ctx is left as-is. There are multiple reasons to not
use a hashfile here, including the fact that the data is not actually
writing to a file, just a hash computation. These hashes do not block
our adoption of the chunk-format API in a future change to the index, so
leave it as-is.

The internals of the algorithms are mostly identical. Previously, the
hashfile API used a smaller 8KB buffer instead of the 128KB buffer from
read-cache.c. The previous change already unified these sizes.

There is one subtle point: we do not pass the CSUM_FSYNC to the
finalize_hashfile() method, which differs from most consumers of the
hashfile API. The extra fsync() call indicated by this flag causes a
significant peformance degradation that is noticeable for quick
commands that write the index, such as "git add". Other consumers can
absorb this cost with their more complicated data structure
organization, and further writing structures such as pack-files and
commit-graphs is rarely in the critical path for common user
interactions.

Some static methods become orphaned in this diff, so I marked them as
MAYBE_UNUSED. The diff is much harder to read if they are deleted during
this change. Instead, they will be deleted in the following change.

In addition to the test suite passing, I computed indexes using the
previous binaries and the binaries compiled after this change, and found
the index data to be exactly equal. Finally, I did extensive performance
testing of "git update-index --force-write" on repos of various sizes,
including one with over 2 million paths at HEAD. These tests
demonstrated less than 1% difference in behavior. As expected, the
performance should be considered unchanged. The previous changes to
increase the hashfile buffer size from 8K to 128K ensured this change
would not create a peformance regression.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-19 16:41:21 +09:00
2ca245f8be csum-file.h: increase hashfile buffer size
The hashfile API uses a hard-coded buffer size of 8KB and has ever since
it was introduced in c38138c (git-pack-objects: write the pack files
with a SHA1 csum, 2005-06-26). It performs a similar function to the
hashing buffers in read-cache.c, but that code was updated from 8KB to
128KB in f279894 (read-cache: make the index write buffer size 128K,
2021-02-18). The justification there was that do_write_index() improves
from 1.02s to 0.72s. Since our end goal is to have the index writing
code use the hashfile API, we need to unify this buffer size to avoid a
performance regression.

There is a buffer, 'check_buffer', that is used to verify the check_fd
file descriptor. When this buffer increases to 128K to fit the data
being flushed, it causes the stack to overflow the limits placed in the
test suite. To avoid issues with stack size, move both 'buffer' and
'check_buffer' to be heap pointers within 'struct hashfile'. The
'check_buffer' member is left as NULL unless check_fd is set in
hashfd_check(). Both buffers are cleared as part of finalize_hashfile()
which also frees the full structure.

Since these buffers are now on the heap, we can adjust their size based
on the needs of the consumer. In particular, callers to
hashfd_throughput() are expecting to report progress indicators as the
buffer flushes. These callers would prefer the smaller 8k buffer to
avoid large delays between updates, especially for users with slower
networks. When the progress indicator is not used, the larger buffer is
preferrable.

By adding a new trace2 region in the chunk-format API, we can see that
the writing portion of 'git multi-pack-index write' lowers from ~1.49s
to ~1.47s on a Linux machine. These effects may be more pronounced or
diminished on other filesystems. The end-to-end timing is too noisy to
have a definitive change either way.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-19 16:41:21 +09:00
aafa5df0df xsize_t: avoid implementation defined behavior when len < 0
The xsize_t helper aims to safely convert an off_t to a size_t,
erroring out when a file offset is too large to fit into a memory
address.  It does this by using two casts:

	size_t size = (size_t) len;
	if (len != (off_t) size)
		... error out ...

On a platform with sizeof(size_t) < sizeof(off_t), this check is safe
and correct.  The first cast truncates to a size_t by finding the
remainder modulo SIZE_MAX+1 (see C99 section 6.3.1.3 Signed and
unsigned integers) and the second promotes to an off_t, meaning the
result is true if and only if len is representable as a size_t.

On other platforms, this two-casts strategy still works well (always
succeeds) for len >= 0.  But for len < 0, when the first cast succeeds
and produces SIZE_MAX + 1 + len, the resulting value is too large to
be represented as an off_t, so the second cast produces implementation
defined behavior.  In practice, it is likely to produce a result of
true despite len not being representable as size_t.

Simplify by replacing with a more straightforward check: compare len
to the relevant bounds and then cast it.  (To avoid a -Wsign-compare
warning, after checking that len >= 0, we explicitly convert to a
sufficiently-large unsigned type before comparing to SIZE_MAX.)

In practice, this is not likely to come up since typical callers use
nonnegative len.  Still, it's helpful to handle this case to make the
behavior easy to reason about.

Historical note: the original bounds-checking in 46be82dfd0 (xsize_t:
check whether we lose bits, 2010-07-28) did not produce this
implementation-defined behavior, though it still did not handle
negative offsets.  It was not until 73560c793a (git-compat-util.h:
xsize_t() - avoid -Wsign-compare warnings, 2017-09-21) introduced the
double cast that the implementation-defined behavior was triggered.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-19 15:00:30 +09:00
ecf7b129fa Revert "remote-curl: fall back to basic auth if Negotiate fails"
This reverts commit 1b0d9545bb.

That commit does fix the situation it intended to (avoiding Negotiate
even when the credentials were provided in the URL), but it creates a
more serious regression: we now never hit the conditional for "we had a
username and password, tried them, but the server still gave us a 401".
That has two bad effects:

 1. we never call credential_reject(), and thus a bogus credential
    stored by a helper will live on forever

 2. we never return HTTP_NOAUTH, so the error message the user gets is
    "The requested URL returned error: 401", instead of "Authentication
    failed".

Doing this correctly seems non-trivial, as we don't know whether the
Negotiate auth was a problem. Since this is a regression in the upcoming
v2.23.0 release (for which we're in -rc0), let's revert for now and work
on a fix separately.

(Note that this isn't a pure revert; the previous commit added a test
showing the regression, so we can now flip it to expect_success).

Reported-by: Ben Humphreys <behumphreys@atlassian.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-19 10:09:58 +09:00
b694f1e49e t5551: test http interaction with credential helpers
We test authentication with http, and we independently test that
credential helpers work, but we don't have any tests that cover the
two features working together. Let's add two:

  1. Make sure that a successful request asks the helper to save the
     credential. This works as expected.

  2. Make sure that a failed request asks the helper to forget the
     credential. This is marked as expect_failure, as it was recently
     regressed by 1b0d9545bb (remote-curl: fall back to basic auth if
     Negotiate fails, 2021-03-22). The symptom here is that the second
     request should prompt the user, but doesn't.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-19 10:09:57 +09:00
f302c1e4aa revisions(7): clarify that most commands take a single revision range
Sometimes new people are confused by how a revision "range" works,
in that it is not a random collection of commits but a set of
commits that are all connected to each other, and most Git commands
work on a single such "range".

Give an example to clarify it.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-18 10:08:03 +09:00
68142e117c hashfile: use write_in_full()
The flush() logic in csum-file.c was introduced originally by c38138c
(git-pack-objects: write the pack files with a SHA1 csum, 2005-06-26)
and a portion of the logic performs similar utility to write_in_full()
in wrapper.c. The history of write_in_full() is full of moves and
renames, but was originally introduced by 7230e6d (Add write_or_die(), a
helper function, 2006-08-21).

The point of these sections of code are to flush a write buffer using
xwrite() and report errors in the case of disk space issues or other
generic input/output errors. The logic in flush() can interpret the
output of write_in_full() to provide the correct error messages to
users.

The logic in the hashfile API has an additional set of logic to augment
the progress indicator between calls to xwrite(). This was introduced by
2a128d6 (add throughput display to git-push, 2007-10-30). It seems that
since the hashfile's buffer is only 8KB, these additional progress
indicators might not be incredibly necessary. Instead, update the
progress only when write_in_full() complete.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-18 06:32:35 +09:00
4279cb1c6e sparse-index: fix uninitialized jump
While testing the sparse-index, I verified a test with --valgrind and it
complained about an uninitialized value being used in a jump in the
path_matches_pattern_list() method. The line was this one:

	if (*dtype == DT_UNKNOWN)

In the call stack, the culprit was the initialization of the dtype
variable in convert_to_sparse_rec().

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-18 06:29:17 +09:00
3d20ed27b8 parallel-checkout: send the new object_id algo field to the workers
An object_id storing a SHA-1 name has some unused bytes at the end of
the hash array. Since these bytes are not used, they are usually not
initialized to any value either. However, at
parallel_checkout.c:send_one_item() the object_id of a cache entry is
copied into a buffer which is later sent to a checkout worker through a
pipe write(). This makes Valgrind complain about passing uninitialized
bytes to a syscall. The worker won't use these uninitialized bytes
either, but the warning could confuse someone trying to debug this code;
So instead of using oidcpy(), send_one_item() uses hashcpy() to only
copy the used/initialized bytes of the object_id, and leave the
remaining part with zeros.

However, since cf0983213c ("hash: add an algo member to struct
object_id", 2021-04-26), using hashcpy() is no longer sufficient here as
it won't copy the new algo field from the object_id. Let's add and use a
new function which meets both our requirements of copying all the
important object_id data while still avoiding the uninitialized bytes,
by padding the end of the hash array in the destination object_id. With
this change, we also no longer need the destination buffer from
send_one_item() to be initialized with zeros, so let's switch from
xcalloc() to xmalloc() to make this clear.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-18 05:38:54 +09:00
58cf6056c9 t7500: remove non-existant C_LOCALE_OUTPUT prereq
The C_LOCALE_OUTPUT prerequisite was removed in b1e079807b (tests:
remove last uses of C_LOCALE_OUTPUT, 2021-02-11), where Ævar noted:

    I'm not leaving the prerequisite itself in place for in-flight changes
    as there currently are none that introduce new tests that rely on it,
    and because C_LOCALE_OUTPUT is currently a noop on the master branch
    we likely won't have any new submissions that use it.

One more use of C_LOCALE_OUTPUT did creep in with 3d1bda6b5b (t7500: add
tests for --fixup=[amend|reword] options, 2021-03-15).  This causes a
number of the tests to be skipped by default:

    ok 35 # SKIP --fixup=reword: incompatible with --all (missing C_LOCALE_OUTPUT)
    ok 36 # SKIP --fixup=reword: incompatible with --include (missing C_LOCALE_OUTPUT)
    ok 37 # SKIP --fixup=reword: incompatible with --only (missing C_LOCALE_OUTPUT)
    ok 38 # SKIP --fixup=reword: incompatible with --interactive (missing C_LOCALE_OUTPUT)
    ok 39 # SKIP --fixup=reword: incompatible with --patch (missing C_LOCALE_OUTPUT)

Remove the C_LOCALE_OUTPUT prerequisite from these tests so they are
not skipped.

Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-18 04:48:30 +09:00
99234d5905 l10n: tr: v2.32.0-r1
Signed-off-by: Emir Sarı <bitigchi@me.com>
2021-05-17 20:32:32 +03:00
7434b92798 l10n: fr: fixed inconsistencies
Signed-off-by: rlespinasse <romain.lespinasse@gmail.com>
2021-05-17 15:16:25 +02:00
9bafe049d6 l10n: fr.po fixed inconsistencies
Signed-off-by: Vincent Tam <sere@live.hk>
2021-05-17 15:16:25 +02:00
e2c5993744 rev-parse: mark die() messages for translation
These error messages are intended for the user. Let's touch them up
since we're here from the previous commit.

Signed-off-by: Wolfgang Müller <wolf@oriole.systems>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-17 18:39:53 +09:00
99fc555188 rev-parse: fix segfault with missing --path-format argument
Calling "git rev-parse --path-format" without an argument segfaults
instead of giving an error message. Commit fac60b8925 (rev-parse: add
option for absolute or relative path formatting, 2020-12-13) added the
argument parsing code but forgot to handle NULL.

Returning an error makes sense here because there is no default value we
could use. Add a test case to verify.

Signed-off-by: Wolfgang Müller <wolf@oriole.systems>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-17 18:39:29 +09:00
e9488197ad l10n: pt_PT: add Portuguese translations part 2
* Eliminated 'Negation of emptiness' of 'nenhum' (not one/none)
* Eliminated 'Negation of emptiness' of 'nada' (nothing)
* Transformed 'Não' (No) into affirmative
* Some other translations
* Transforming 'não' (no) into affirmative
* From junção-de-3 to tri-junção

Signed-off-by: Daniel Santos <hello@brighterdan.com>
2021-05-17 09:42:44 +01:00
2b95ebb4fd l10n: git.pot: v2.32.0 round 1 (126 new, 26 removed)
Generate po/git.pot from v2.32.0-rc0 for git v2.32.0 l10n round 1.

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
2021-05-17 16:06:49 +08:00
bfe35a6165 describe-doc: clarify default length of abbreviation
Clarify the default length used for the abbreviated form used for
commits in git describe.

The behavior was modified in Git 2.11.0, but the documentation was not
updated to clarify the new behavior.

Signed-off-by: Anders Höckersten <anders@hockersten.se>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-17 15:56:29 +09:00
72ee47ceeb mailinfo: don't discard names under 3 characters
I sometimes receive patches from people with short mononyms, and in my
cultural environment these are not uncommon. To my dismay, git-am
currently discards their names, and replaces them with their email
addresses.

Link: https://www.kalzumeus.com/2010/06/17/falsehoods-programmers-believe-about-names/
Signed-off-by: edef <edef@edef.eu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-17 07:35:43 +09:00
f5f5a61d5a submodule: use the imperative mood to describe the --files option
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-17 07:31:40 +09:00
4901884a23 stash: don't translate literal commands
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-17 07:21:04 +09:00
cd5b33fbdc git-send-email: add option to specify sendmail command
The sendemail.smtpServer configuration option and --smtp-server command
line option both support using a sendmail-like program to send emails by
specifying an absolute file path. However, this is not ideal for the
following reasons:

1. It overloads the meaning of smtpServer (now a program is being used
   for the server?)
2. It doesn't allow for non-absolute paths, arguments, or arbitrary
   scripting

Requiring an absolute path is bad for portability, as the same program
may be in different locations on different systems. If a user wishes to
pass arguments to their program, they have to use the smtpServerOption
option, which is cumbersome (as it must be repeated for each option) and
doesn't adhere to normal git conventions.

Introduce a new configuration option sendemail.sendmailCmd as well as a
command line option --sendmail-cmd that can be used to specify a command
(with or without arguments) or shell expression to run to send email.
The name of this option is consistent with --to-cmd and --cc-cmd. This
invocation honors the user's $PATH so that absolute paths are not
necessary. Arbitrary shell expressions are also supported, allowing
users to do basic scripting.

Give this option a higher precedence over --smtp-server and
sendemail.smtpServer, as the new interface is more flexible. For
backward compatibility, continue to support absolute paths in
--smtp-server and sendemail.smtpServer.

Signed-off-by: Gregory Anders <greg@gpanders.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-17 07:06:13 +09:00
bf949ade81 Git 2.32-rc0
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-16 21:05:24 +09:00
e004fd6b69 Merge branch 'ls/typofix'
* ls/typofix:
  pretty: fix a typo in the documentation for %(trailers)
2021-05-16 21:05:24 +09:00
a8a2491e62 Merge branch 'dl/stash-show-untracked-fixup'
The code to handle options recently added to "git stash show"
around untracked part of the stash segfaulted when these options
were used on a stash entry that does not record untracked part.

* dl/stash-show-untracked-fixup:
  stash show: fix segfault with --{include,only}-untracked
  t3905: correct test title
2021-05-16 21:05:24 +09:00
16f91451fa Merge branch 'wc/packed-ref-removal-cleanup'
When "git update-ref -d" removes a ref that is packed, it left
empty directories under $GIT_DIR/refs/ for

* wc/packed-ref-removal-cleanup:
  refs: cleanup directories when deleting packed ref
2021-05-16 21:05:24 +09:00
94294e92e1 Merge branch 'lh/maintenance-leakfix'
* lh/maintenance-leakfix:
  maintenance: fix two memory leaks
2021-05-16 21:05:24 +09:00
caf6840be0 Merge branch 'ma/typofixes'
A couple of trivial typofixes.

* ma/typofixes:
  pretty-formats.txt: add missing space
  git-repack.txt: remove spurious ")"
2021-05-16 21:05:24 +09:00
c7c7c460f8 Merge branch 'ah/merge-ort-i18n'
An i18n fix.

* ah/merge-ort-i18n:
  merge-ort: split "distinct types" message into two translatable messages
2021-05-16 21:05:23 +09:00
483932a3d8 Merge branch 'dd/mailinfo-quoted-cr'
"git mailinfo" (hence "git am") learned the "--quoted-cr" option to
control how lines ending with CRLF wrapped in base64 or qp are
handled.

* dd/mailinfo-quoted-cr:
  am: learn to process quoted lines that ends with CRLF
  mailinfo: allow stripping quoted CR without warning
  mailinfo: allow squelching quoted CRLF warning
  mailinfo: warn if CRLF found in decoded base64/QP email
  mailinfo: stop parsing options manually
  mailinfo: load default metainfo_charset lazily
2021-05-16 21:05:23 +09:00
c8e34a7ac2 Merge branch 'ab/sparse-index-cleanup'
Code clean-up.

* ab/sparse-index-cleanup:
  sparse-index.c: remove set_index_sparse_config()
2021-05-16 21:05:23 +09:00
502a67891c Merge branch 'ab/streaming-simplify'
Code clean-up.

* ab/streaming-simplify:
  streaming.c: move {open,close,read} from vtable to "struct git_istream"
  streaming.c: stop passing around "object_info *" to open()
  streaming.c: remove {open,close,read}_method_decl() macros
  streaming.c: remove enum/function/vtbl indirection
  streaming.c: avoid forward declarations
2021-05-16 21:05:23 +09:00
a737e1f1d2 Merge branch 'mt/parallel-checkout-part-3'
The final part of "parallel checkout".

* mt/parallel-checkout-part-3:
  ci: run test round with parallel-checkout enabled
  parallel-checkout: add tests related to .gitattributes
  t0028: extract encoding helpers to lib-encoding.sh
  parallel-checkout: add tests related to path collisions
  parallel-checkout: add tests for basic operations
  checkout-index: add parallel checkout support
  builtin/checkout.c: complete parallel checkout support
  make_transient_cache_entry(): optionally alloc from mem_pool
2021-05-16 21:05:23 +09:00
644f4a2046 Merge branch 'jt/push-negotiation'
"git push" learns to discover common ancestor with the receiving
end over protocol v2.

* jt/push-negotiation:
  send-pack: support push negotiation
  fetch: teach independent negotiation (no packfile)
  fetch-pack: refactor command and capability write
  fetch-pack: refactor add_haves()
  fetch-pack: refactor process_acks()
2021-05-16 21:05:22 +09:00
a30e43f61a merge: don't translate literal commands
These strings have not been modified in any translation, nor should they
be.

Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-16 13:00:28 +09:00
afb82d1db0 l10n: Update Catalan translation
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
2021-05-14 21:14:52 +02:00
97eea85a0a The seventeenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-14 08:26:11 +09:00
52371bf449 Merge branch 'mt/clean-clean'
Code clean-up.

* mt/clean-clean:
  clean: remove unnecessary variable
2021-05-14 08:26:11 +09:00
47fa106617 Merge branch 'ow/no-dryrun-in-add-i'
"git add -i --dry-run" does not dry-run, which was surprising.  The
combination of options has taught to error out.

* ow/no-dryrun-in-add-i:
  add: die if both --dry-run and --interactive are given
2021-05-14 08:26:09 +09:00
e289f681ed Merge branch 'jk/p4-locate-branch-point-optim'
"git p4" learned to find branch points more efficiently.

* jk/p4-locate-branch-point-optim:
  git-p4: speed up search for branch parent
  git-p4: ensure complex branches are cloned correctly
2021-05-14 08:26:08 +09:00
eede71149e Merge branch 'ba/object-info'
Over-the-wire protocol learns a new request type to ask for object
sizes given a list of object names.

* ba/object-info:
  object-info: support for retrieving object info
2021-05-14 08:26:08 +09:00
daffa8961b Merge branch 'pw/patience-diff-clean-up'
Code clean-up.

* pw/patience-diff-clean-up:
  patience diff: remove unused variable
  patience diff: remove unnecessary string comparisons
2021-05-14 08:26:08 +09:00
65c18913de Merge branch 'pw/word-diff-zero-width-matches'
The word-diff mode has been taught to work better with a word
regexp that can match an empty string.

* pw/word-diff-zero-width-matches:
  word diff: handle zero length matches
2021-05-14 08:26:06 +09:00
1197f1a463 ref-filter: introduce enum atom_type
In the original ref-filter design, it will copy the parsed
atom's name and attributes to `used_atom[i].name` in the
atom's parsing step, and use it again for string matching
in the later specific ref attributes filling step. It use
a lot of string matching to determine which atom we need.

Introduce the enum "atom_type", each enum value is named
as `ATOM_*`, which is the index of each corresponding
valid_atom entry. In the first step of the atom parsing,
`used_atom.atom_type` will record corresponding enum value
from valid_atom entry index, and then in specific reference
attribute filling step, only need to compare the value of
the `used_atom[i].atom_type` to check the atom type.

Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Christian Couder <christian.couder@gmail.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-14 06:37:28 +09:00
0caf20f228 ref-filter: add objectsize to used_atom
When the support for "objectsize:disk" was bolted onto the
existing support for "objectsize", it didn't follow the
usual pattern for handling "atomtype:modifier", which reads
the <modifier> part just once while parsing the format
string, and store the parsed result in the union in the
used_atom structure, so that the string form of it does not
have to be parsed over and over at runtime (e.g. in
grab_common_values()).

Add a new member `objectsize` to the union `used_atom.u`,
so that we can separate the check of <modifier> from the
check of <atomtype>, this will bring scalability to atom
`%(objectsize)`.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-14 06:37:27 +09:00
e5b66bb324 l10n: ru.po: fix typo in Russian translation
Signed-off-by: Alexey Roslyakov <alexey.roslyakov@gmail.com>
2021-05-13 12:48:37 +03:00
2d86a96220 t: avoid sed-based chain-linting in some expensive cases
Commit 878f988350 (t/test-lib: teach --chain-lint to detect broken
&&-chains in subshells, 2018-07-11) introduced additional chain-lint
tests which add an extra "sed" pipeline to each test we run. This has a
measurable impact on runtime. Here are timings with and without a new
environment variable (added by this patch) that lets you disable just
the additional sed-based chain-lint tests:

  Benchmark #1: GIT_TEST_CHAIN_LINT_HARDER=1 make test
    Time (mean ± σ):     64.202 s ±  1.030 s    [User: 622.469 s, System: 301.402 s]
    Range (min … max):   61.571 s … 65.662 s    10 runs

  Benchmark #2: GIT_TEST_CHAIN_LINT_HARDER=0 make test
    Time (mean ± σ):     57.591 s ±  0.333 s    [User: 529.368 s, System: 270.618 s]
    Range (min … max):   57.143 s … 58.309 s    10 runs

  Summary
    'GIT_TEST_CHAIN_LINT_HARDER=0 make test' ran
      1.11 ± 0.02 times faster than 'GIT_TEST_CHAIN_LINT_HARDER=1 make test'

Of course those extra lint checks are doing something useful, so paying
a few extra seconds (at least on Linux) isn't so bad (though note the
CPU time; we're bounded in our parallel run here by the slowest test, so
it really is ~120s of CPU improvement).

But we can observe that there are some test scripts where they produce a
much stronger effect, and provide less value. In t0027 and t3070 we run
a very large number of small tests, all driven by a series of
functions/loops which are filling in the test bodies. There we get much
less bang for our buck in terms of bug-finding versus CPU cost.

This patch introduces a mechanism for controlling when those extra
lint checks are run, at two levels:

  - a user can ask to disable or to force-enable the checks by setting
    GIT_TEST_CHAIN_LINT_HARDER

  - if the user hasn't specified a preference, individual scripts can
    disable the checks by setting GIT_TEST_CHAIN_LINT_HARDER_DEFAULT;
    scripts which don't set that get the current behavior of enabling
    them.

In addition, this patch flips the default for t0027 and t3070's
mass-generated sections to disable the extra checks. Here are the timing
results for t0027:

  Benchmark #1: GIT_TEST_CHAIN_LINT_HARDER=1 ./t0027-auto-crlf.sh
    Time (mean ± σ):     17.078 s ±  0.848 s    [User: 14.878 s, System: 7.075 s]
    Range (min … max):   15.952 s … 18.421 s    10 runs

  Benchmark #2: GIT_TEST_CHAIN_LINT_HARDER=0 ./t0027-auto-crlf.sh
    Time (mean ± σ):      9.063 s ±  0.759 s    [User: 7.890 s, System: 3.362 s]
    Range (min … max):    7.747 s … 10.619 s    10 runs

  Benchmark #3: ./t0027-auto-crlf.sh
    Time (mean ± σ):      9.186 s ±  0.881 s    [User: 7.957 s, System: 3.427 s]
    Range (min … max):    7.796 s … 10.498 s    10 runs

  Summary
    'GIT_TEST_CHAIN_LINT_HARDER=0 ./t0027-auto-crlf.sh' ran
      1.01 ± 0.13 times faster than './t0027-auto-crlf.sh'
      1.88 ± 0.18 times faster than 'GIT_TEST_CHAIN_LINT_HARDER=1 ./t0027-auto-crlf.sh'

We can see that disabling the checks for the whole script buys us an
almost 2x speedup. But the new default behavior, disabling them only for
the mass-generated part, gets us most of that speedup (but still leaves
the checks on for further manual tests people might write).

  As a side note, I'd caution about comparing runtimes and CPU seconds
  between this timing and the earlier "make test" one. In "make test",
  we're running a lot of scripts in parallel, so the CPU is throttling
  down (and thus a CPU second saved here would count for more during a
  parallel run; the same work takes more CPU seconds there).

We get similar results for t3070:

  Benchmark #1: GIT_TEST_CHAIN_LINT_HARDER=1 ./t3070-wildmatch.sh
    Time (mean ± σ):     20.054 s ±  3.967 s    [User: 16.003 s, System: 8.286 s]
    Range (min … max):   11.891 s … 23.671 s    10 runs

  Benchmark #2: GIT_TEST_CHAIN_LINT_HARDER=0 ./t3070-wildmatch.sh
    Time (mean ± σ):     12.399 s ±  2.256 s    [User: 7.542 s, System: 5.342 s]
    Range (min … max):    9.606 s … 15.727 s    10 runs

  Benchmark #3: ./t3070-wildmatch.sh
    Time (mean ± σ):     10.726 s ±  3.476 s    [User: 6.790 s, System: 4.365 s]
    Range (min … max):    5.444 s … 15.376 s    10 runs

  Summary
    './t3070-wildmatch.sh' ran
      1.16 ± 0.43 times faster than 'GIT_TEST_CHAIN_LINT_HARDER=0 ./t3070-wildmatch.sh'
      1.87 ± 0.71 times faster than 'GIT_TEST_CHAIN_LINT_HARDER=1 ./t3070-wildmatch.sh'

Again, we get almost a 2x speedup disabling these. In this case, there
are no tests not covered by the script's "default to disable" behavior,
so the second two benchmarks should be the same (and while they do
differ, you can see the variance is quite high but they're within one
standard deviation).

So it seems like for these two scripts, at least, disabling the extra
checks is a reasonable tradeoff. Sadly, the overall runtime of "make
test" on my system doesn't get much faster. But that's because we're
mostly limited by the cost of the single biggest test. Here are the
top-5 tests by wall-clock time from a parallel run, before my patch:

  57.9192368984222 t9001-send-email.sh
  45.6329638957977 t0027-auto-crlf.sh
  32.5278220176697 t3070-wildmatch.sh
  22.2701289653778 t7610-mergetool.sh
  20.8635759353638 t1701-racy-split-index.sh

And after:

  57.1476998329163 t9001-send-email.sh
  33.776211977005 t0027-auto-crlf.sh
  21.3116669654846 t7610-mergetool.sh
  20.7748689651489 t1701-racy-split-index.sh
  19.6957249641418 t7112-reset-submodule.sh

We dropped 12s from t0027, and t3070 dropped off our list entirely at
around 16s. In both cases we're bound by t9001, but its slowness is
due to the actual tests, so we'll have to deal with it in a different
way. But this reduces overall CPU, and means that dealing with t9001 (by
improving the speed of send-email or splitting it apart) will let us
reduce our overall runtime even on multi-core machines.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-13 15:50:44 +09:00
5c0cbdb107 git-prompt: work under set -u
Commit afda36dbf3 ("git-prompt: include sparsity state as well",
2020-06-21) added the use of some variables to control how to show
sparsity state in the git prompt, but implicitly assumed that undefined
variables would be treated as the empty string.  This breaks users who
run under 'set -u'; fix the code to be more explicit.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-13 15:50:26 +09:00
1ff595d218 stash show: fix segfault with --{include,only}-untracked
When `git stash show --include-untracked` or
`git stash show --only-untracked` is run on a stash that doesn't include
an untracked entry, a segfault occurs. This happens because we do not
check whether the untracked entry is actually present and just attempt
to blindly dereference it.

Ensure that the untracked entry is present before actually attempting to
dereference it.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-13 08:48:59 +09:00
aa2b05d9f6 t3905: correct test title
We reference the non-existent option `git stash show --show-untracked`
when we really meant `--only-untracked`. Correct the test title
accordingly.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-13 08:48:16 +09:00
b548f0f156 dir: introduce readdir_skip_dot_and_dotdot() helper
Many places in the code were doing
    while ((d = readdir(dir)) != NULL) {
        if (is_dot_or_dotdot(d->d_name))
            continue;
        ...process d...
    }
Introduce a readdir_skip_dot_and_dotdot() helper to make that a one-liner:
    while ((d = readdir_skip_dot_and_dotdot(dir)) != NULL) {
        ...process d...
    }

This helper particularly simplifies checks for empty directories.

Also use this helper in read_cached_dir() so that our statistics are
consistent across platforms.  (In other words, read_cached_dir() should
have been using is_dot_or_dotdot() and skipping such entries, but did
not and left it to treat_path() to detect and mark such entries as
path_none.)

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-13 08:45:03 +09:00
4e689d8171 dir: update stale description of treat_directory()
The documentation comment for treat_directory() was originally written
in 095952 (Teach directory traversal about subprojects, 2007-04-11)
which was before the 'struct dir_struct' split its bitfield of named
options into a 'flags' enum in 7c4c97c0 (Turn the flags in struct
dir_struct into a single variable, 2009-02-16). When those flags
changed, the comment became stale, since members like
'show_other_directories' transitioned into flags like
DIR_SHOW_OTHER_DIRECTORIES.

Update the comments for treat_directory() to use these flag names rather
than the old member names.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-13 08:45:03 +09:00
dd55fc0df1 dir: traverse into untracked directories if they may have ignored subfiles
A directory that is untracked does not imply that all files under it
should be categorized as untracked; in particular, if the caller is
interested in ignored files, many files or directories underneath the
untracked directory may be ignored.  We previously partially handled
this right with DIR_SHOW_IGNORED_TOO, but missed DIR_SHOW_IGNORED.  It
was not obvious, though, because the logic for untracked and excluded
files had been fused together making it harder to reason about.  The
previous commit split that logic out, making it easier to notice that
DIR_SHOW_IGNORED was missing.  Add it.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-13 08:45:03 +09:00
aa6e1b21e5 dir: avoid unnecessary traversal into ignored directory
The show_other_directories case in treat_directory() tried to handle
both excludes and untracked files with the same logic, and mishandled
both the excludes and the untracked files in the process, in different
ways.  Split that logic apart, and then focus on the logic for the
excludes; a subsequent commit will address the logic for untracked
files.

For show_other_directories, an excluded directory means that
every path underneath that directory will also be excluded.  Given that
the calling code requested to just show directories when everything
under a directory had the same state (that's what the
"DIR_SHOW_OTHER_DIRECTORIES" flag means), we generally do not need to
traverse into such directories and can just immediately mark them as
ignored (i.e. as path_excluded).  The only reason we cannot just
immediately return path_excluded is the DIR_HIDE_EMPTY_DIRECTORIES flag
and the possibility that the ignored directory is an empty directory.
The code previously treated DIR_SHOW_IGNORED_TOO in most cases as an
exception as well, which was wrong.  It can sometimes reduce the number
of cases where we need to recurse (namely if
DIR_SHOW_IGNORED_TOO_MODE_MATCHING is also set), but should not be able
to increase the number of cases where we need to recurse.  Fix the logic
accordingly.

Some sidenotes about possible confusion with dir.c:

* "ignored" often refers to an untracked ignore", i.e. a file which is
  not tracked which matches one of the ignore/exclusion rules.  But you
  can also have a "tracked ignore", a tracked file that happens to match
  one of the ignore/exclusion rules and which dir.c has to worry about
  since "git ls-files -c -i" is supposed to list them.

* The dir code often uses "ignored" and "excluded" interchangeably,
  which you need to keep in mind while reading the code.

* "exclude" is used multiple ways in the code:

  * As noted above, "exclude" is often a synonym for "ignored".

  * The logic for parsing .gitignore files was re-used in
    .git/info/sparse-checkout, except there it is used to mark paths that
    the user wants to *keep*.  This was mostly addressed by commit
    65edd96aec ("treewide: rename 'exclude' methods to 'pattern'",
    2019-09-03), but every once in a while you'll find a comment about
    "exclude" referring to these patterns that might in fact be in use
    by the sparse-checkout machinery for inclusion rules.

  * The word "EXCLUDE" is also used for pathspec negation, as in
      (pathspec->items[3].magic & PATHSPEC_EXCLUDE)
    Thus if a user had a .gitignore file containing
      *~
      *.log
      !settings.log
    And then ran
      git add -- 'settings.*' ':^settings.log'
    Then :^settings.log is a pathspec negation making settings.log not
    be requested to be added even though all other settings.* files are
    being added.  Also, !settings.log in the gitignore file is a negative
    exclude pattern meaning that settings.log is normally a file we
    want to track even though all other *.log files are ignored.

Sometimes it feels like dir.c needs its own glossary with its many
definitions, including the multiply-defined terms.

Reported-by: Jason Gore <Jason.Gore@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-13 08:45:03 +09:00
a97c7a8bc4 t3001, t7300: add testcase showcasing missed directory traversal
In the last commit, we added a testcase showing that the directory
traversal machinery sometimes traverses into directories unnecessarily.
Here we show that there are cases where it does the opposite: it does
not traverse into directories, despite those directories having
important files that need to be flagged.

Add a testcase showing that `git ls-files -o -i --directory` can omit
some of the files it should be listing, and another showing that `git
clean -fX` can fail to clean out some of the expected files.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-13 08:45:03 +09:00
2e4e43a691 t7300: add testcase showing unnecessary traversal into ignored directory
The PNPM package manager is apparently creating deeply nested (but
ignored) directory structures; traversing them is costly
performance-wise, unnecessary, and in some cases is even throwing
warnings/errors because the paths are too long to handle on various
platforms.  Add a testcase that checks for such unnecessary directory
traversal.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-13 08:45:03 +09:00
b338e9f668 ls-files: error out on -i unless -o or -c are specified
ls-files --ignored can be used together with either --others or
--cached.  After being perplexed for a bit and digging in to the code, I
assumed that ls-files -i was just broken and not printing anything and
I had a nice patch ready to submit when I finally realized that -i can be
used with --cached to find tracked ignores.

While that was a mistake on my part, and a careful reading of the
documentation could have made this more clear, I suspect this is an
error others are likely to make as well.  In fact, of two uses in our
testsuite, I believe one of the two did make this error.  In t1306.13,
there are NO tracked files, and all the excludes built up and used in
that test and in previous tests thus have to be about untracked files.
However, since they were looking for an empty result, the mistake went
unnoticed as their erroneous command also just happened to give an empty
answer.

-i will most the time be used with -o, which would suggest we could just
make -i imply -o in the absence of either a -o or -c, but that would be
a backward incompatible break.  Instead, let's just flag -i without
either a -o or -c as an error, and update the two relevant testcases to
specify their intent.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-13 08:45:03 +09:00
7fe1ffdafa dir: report number of visited directories and paths with trace2
Provide more statistics in trace2 output that include the number of
directories and total paths visited by the directory traversal logic.
Subsequent patches will take advantage of this to ensure we do not
unnecessarily traverse into ignored directories.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-13 08:45:02 +09:00
7f9dd87922 dir: convert trace calls to trace2 equivalents
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-13 08:45:02 +09:00
e6f68f62e0 pretty: fix a typo in the documentation for %(trailers)
Signed-off-by: Louis Sautier <sautier.louis@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-13 07:47:51 +09:00
8c55753c68 Makefile: make PERL_DEFINES recursively expanded
Since 07d90eadb5 (Makefile: add Perl runtime prefix support,
2018-04-10) PERL_DEFINES has been a simply-expanded variable, let's
make it recursively expanded instead.

This change doesn't matter for the correctness of the logic. Whether
we used simply-expanded or recursively expanded didn't change what we
wrote out in GIT-PERL-DEFINES, but being consistent with other rules
makes this easier to understand.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-13 07:45:39 +09:00
00bc8390d8 remote-curl: fix clone on sha256 repos
The remote-https process needs to update it's own instance of
`the_repository' when it sees an HTTP(S) remote is using sha256.
Without this, parse_oid_hex() fails to handle sha256 OIDs when
it's eventually called by parse_fetch().

Tested with:

	git clone https://yhbt.net/sha256test.git
	GIT_SMART_HTTP=0 git clone https://yhbt.net/sha256test.git
	(plain http:// also works)

Cloning the URL via git:// required no changes

Signed-off-by: Eric Wong <e@80x24.org>
Acked-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-12 12:14:44 +09:00
1e1c4c5eac ref-filter: fix read invalid union member bug
used_atom.u is an union, and it has different members depending on
what atom the auxiliary data the union part of the "struct
used_atom" wants to record. At most only one of the members can be
valid at any one time. Since the code checks u.remote_ref without
even making sure if the atom is "push" or "push:" (which are only
two cases that u.remote_ref.push becomes valid), but u.remote_ref
shares the same storage for other members of the union, the check
was reading from an invalid member, which was the bug.

Modify the condition here to check whether the atom name
equals to "push" or starts with "push:", to avoid reading the
value of invalid member of the union.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
[jc: further test fixes]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-12 08:13:14 +09:00
c5d0b12a4c maintenance: fix two memory leaks
Fixes two memory leaks when running `git maintenance start` or `git
maintenance stop` in `update_background_schedule`:

$ valgrind --leak-check=full ~/git/bin/git maintenance start
==76584== Memcheck, a memory error detector
==76584== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==76584== Using Valgrind-3.16.1 and LibVEX; rerun with -h for copyright info
==76584== Command: /home/lenaic/git/bin/git maintenance start
==76584==
==76584==
==76584== HEAP SUMMARY:
==76584==     in use at exit: 34,880 bytes in 252 blocks
==76584==   total heap usage: 820 allocs, 568 frees, 146,414 bytes allocated
==76584==
==76584== 65 bytes in 1 blocks are definitely lost in loss record 17 of 39
==76584==    at 0x483E6AF: malloc (vg_replace_malloc.c:306)
==76584==    by 0x3DC39C: xrealloc (wrapper.c:126)
==76584==    by 0x3992CC: strbuf_grow (strbuf.c:98)
==76584==    by 0x39A473: strbuf_vaddf (strbuf.c:392)
==76584==    by 0x39BC54: xstrvfmt (strbuf.c:979)
==76584==    by 0x39BD2C: xstrfmt (strbuf.c:989)
==76584==    by 0x18451B: update_background_schedule (gc.c:1977)
==76584==    by 0x1846F6: maintenance_start (gc.c:2011)
==76584==    by 0x1847B4: cmd_maintenance (gc.c:2030)
==76584==    by 0x127A2E: run_builtin (git.c:453)
==76584==    by 0x127E81: handle_builtin (git.c:704)
==76584==    by 0x128142: run_argv (git.c:771)
==76584==
==76584== 240 bytes in 1 blocks are definitely lost in loss record 29 of 39
==76584==    at 0x4840D7B: realloc (vg_replace_malloc.c:834)
==76584==    by 0x491CE5D: getdelim (in /usr/lib/libc-2.33.so)
==76584==    by 0x39ADD7: strbuf_getwholeline (strbuf.c:635)
==76584==    by 0x39AF31: strbuf_getdelim (strbuf.c:706)
==76584==    by 0x39B064: strbuf_getline_lf (strbuf.c:727)
==76584==    by 0x184273: crontab_update_schedule (gc.c:1919)
==76584==    by 0x184678: update_background_schedule (gc.c:1997)
==76584==    by 0x1846F6: maintenance_start (gc.c:2011)
==76584==    by 0x1847B4: cmd_maintenance (gc.c:2030)
==76584==    by 0x127A2E: run_builtin (git.c:453)
==76584==    by 0x127E81: handle_builtin (git.c:704)
==76584==    by 0x128142: run_argv (git.c:771)
==76584==
==76584== LEAK SUMMARY:
==76584==    definitely lost: 305 bytes in 2 blocks
==76584==    indirectly lost: 0 bytes in 0 blocks
==76584==      possibly lost: 0 bytes in 0 blocks
==76584==    still reachable: 34,575 bytes in 250 blocks
==76584==         suppressed: 0 bytes in 0 blocks
==76584== Reachable blocks (those to which a pointer was found) are not shown.
==76584== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==76584==
==76584== For lists of detected and suppressed errors, rerun with: -s
==76584== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)

Signed-off-by: Lénaïc Huard <lenaic@lhuard.fr>
Acked-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-12 07:00:45 +09:00
df6c4f722c The sixteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11 15:27:23 +09:00
2cd6ce21f3 Merge branch 'zh/trailer-cmd'
The way the command line specified by the trailer.<token>.command
configuration variable receives the end-user supplied value was
both error prone and misleading.  An alternative to achieve the
same goal in a safer and more intuitive way has been added, as
the trailer.<token>.cmd configuration variable, to replace it.

* zh/trailer-cmd:
  trailer: add new .cmd config option
  docs: correct descript of trailer.<token>.command
2021-05-11 15:27:23 +09:00
416449eaba Merge branch 'jk/symlinked-dotgitx-cleanup'
Various test and documentation updates about .gitsomething paths
that are symlinks.

* jk/symlinked-dotgitx-cleanup:
  docs: document symlink restrictions for dot-files
  fsck: warn about symlinked dotfiles we'll open with O_NOFOLLOW
  t0060: test ntfs/hfs-obscured dotfiles
  t7450: test .gitmodules symlink matching against obscured names
  t7450: test verify_path() handling of gitmodules
  t7415: rename to expand scope
  fsck_tree(): wrap some long lines
  fsck_tree(): fix shadowed variable
  t7415: remove out-dated comment about translation
2021-05-11 15:27:23 +09:00
1af57f5d32 Merge branch 'jk/pack-objects-negative-options-fix'
Options to "git pack-objects" that take numeric values like
--window and --depth should not accept negative values; the input
validation has been tightened.

* jk/pack-objects-negative-options-fix:
  pack-objects: clamp negative depth to 0
  t5316: check behavior of pack-objects --depth=0
  pack-objects: clamp negative window size to 0
  t5300: check that we produced expected number of deltas
  t5300: modernize basic tests
2021-05-11 15:27:23 +09:00
270f8bfe00 Merge branch 'jk/doc-format-patch-skips-merges'
Document that "format-patch" skips merges.

* jk/doc-format-patch-skips-merges:
  docs/format-patch: mention handling of merges
2021-05-11 15:27:23 +09:00
0b77301bf4 Merge branch 'jc/test-allows-local'
Document that our test can use "local" keyword.

* jc/test-allows-local:
  CodingGuidelines: explicitly allow "local" for test scripts
2021-05-11 15:27:22 +09:00
74339f814c Merge branch 'nc/submodule-update-quiet'
"git submodule update --quiet" did not propagate the quiet option
down to underlying "git fetch", which has been corrected.

* nc/submodule-update-quiet:
  submodule update: silence underlying fetch with "--quiet"
2021-05-11 15:27:22 +09:00
5feebddd86 Merge branch 'js/merge-already-up-to-date-message-reword'
A few variants of informational message "Already up-to-date" has
been rephrased.

* js/merge-already-up-to-date-message-reword:
  merge: fix swapped "up to date" message components
  merge(s): apply consistent punctuation to "up to date" messages
2021-05-11 15:27:22 +09:00
8ca4771dd0 Merge branch 'rj/bisect-skip-honor-terms'
"git bisect skip" when custom words are used for new/old did not
work, which has been corrected.

* rj/bisect-skip-honor-terms:
  bisect--helper: use BISECT_TERMS in 'bisect skip' command
2021-05-11 15:27:22 +09:00
5f03e5126d refs: cleanup directories when deleting packed ref
When deleting a packed ref via 'update-ref -d', a lockfile is made in
the directory that would contain the loose copy of that ref, creating
any directories in the ref's path that do not exist. When the
transaction completes, the lockfile is deleted, but any empty parent
directories made when creating the lockfile are left in place.  These
empty directories are not removed by 'pack-refs' or other housekeeping
tasks and will accumulate over time.

When deleting a loose ref, we remove all empty parent directories at the
end of the transaction.

This commit applies the parent directory cleanup logic used when
deleting loose refs to packed refs as well.

Signed-off-by: Will Chandler <wfc@wfchandler.org>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11 13:59:57 +09:00
64568c7171 describe tests: support -C in "check_describe"
Change a subshell added in a preceding commit to instead use a new
"-C" option to "check_describe". The idiom for this is copied as-is
from the "test_commit" function in test-lib-functions.sh

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11 12:48:09 +09:00
33b4ae1114 describe tests: fix nested "test_expect_success" call
Fix a nested invocation of "test_expect_success", the
"check_describe()" function is a wrapper for calling
test_expect_success, and therefore needs to be called outside the body
of another "test_expect_success".

The two tests added in 30b1c7ad9d (describe: don't abort too early
when searching tags, 2020-02-26) were not testing for anything due to
this logic error. Without this fix reverting the C code changes in
that commit still has all tests passing, with this fix we're actually
testing the "describe" output. This is because "test_expect_success"
calls "test_finish_", whose last statement happens to be true.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11 12:48:09 +09:00
7f07c1cdb7 describe tests: don't rely on err.actual from "check_describe"
Convert the one test that relied on the "err.actual" file produced by
check_describe() to instead do its own check of "git describe"
output.

This means that the two tests won't have an inter-dependency (e.g. if
the earlier test is skipped).

An earlier version of this patch instead asserted that no other test
had any output on stderr. We're not doing that here out of fear that
"gc --auto" or another future change to "git describe" will cause it
to legitimately emit output on stderr unexpectedly[1].

I'd think that inverting the test added in 3291fe4072 (Add
git-describe test for "verify annotated tag names on output",
2008-03-03) to make checking that we don't have warnings the rule
rather than the exception would be the sort of thing the describe
tests should be catching, but for now let's leave it as it is.

1. http://lore.kernel.org/git/xmqqwnuqo8ze.fsf@gitster.c.googlers.com

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11 12:48:09 +09:00
a46a848296 describe tests: refactor away from glob matching
Change the glob matching via a "case" statement to a "test_cmp" after
we've stripped out the hash-specific g<hash-abbrev>
suffix. 5312ab11fb (Add describe test., 2007-01-13).

This means that we can use test_cmp to compare the output. I could
omit the "-8" change of e.g. "A-*" to "A-8-gHASH", but I think it
makes sense to test that here explicitly. It means you need to add new
tests to the bottom of the file, but that's not a burden in this case.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11 12:48:09 +09:00
2df546e17f describe tests: improve test for --work-tree & --dirty
Improve tests added in 9f67d2e827 (Teach "git describe" --dirty
option, 2009-10-21) and 2ed5c8e174 (describe: setup working tree for
--dirty, 2019-02-03) so that they make sense in combination with each
other.

The "check_describe" being removed here was the earlier test, we then
later added these --work-tree tests which really just wanted to check
if we got the exact same output from "describe", but the test wasn't
structured to test for that.

Let's change it to do that, which both improves test coverage and
makes it more obvious what's going on here.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11 12:48:09 +09:00
5d93460024 xdiff-interface: replace discard_hunk_line() with a flag
Remove the dummy discard_hunk_line() function added in
3b40a090fd (diff: avoid generating unused hunk header lines,
2018-11-02) in favor of having a new XDL_EMIT_NO_HUNK_HDR flag, for
use along with the two existing and similar XDL_EMIT_* flags.

Unlike the recently amended xdiff_emit_line_fn interface which'll be
called in a loop in xdl_emit_diff(), the hunk header is only emitted
once.

It makes more sense to pass this as a flag than provide a dummy
callback because that function may be able to skip doing certain work
if it knows the caller is doing nothing with the hunk header.

It would be possible to do so in the case of -U0 now, but the benefit
of doing so is so small that I haven't bothered. But this leaves the
door open to that, and more importantly makes the API use more
intuitive.

The reason we're putting a flag in the gap between 1<<0 and 1<<2 is
that the old 1<<1 flag was removed in 907681e940 (xdiff: drop
XDL_EMIT_COMMON, 2016-02-23) without re-ordering the remaining flags.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11 12:47:31 +09:00
22233d43eb xdiff users: use designated initializers for out_line
Amend the code added in 611e42a598 (xdiff: provide a separate emit
callback for hunks, 2018-11-02) to be more readable by using
designated initializers.

This changes "priv" in rerere.c to be initialized to NULL as we did in
merge-tree.c. That's not needed as we'll only use it if the callback
is defined, but being consistent here is better and less verbose.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11 12:47:31 +09:00
f97fe35857 pickaxe -G: don't special-case create/delete
Instead of special-casing creations and deletions let's just generate
a diff for them.

This logic of not running a diff under -G if we don't have both sides
dates back to the original implementation of -S in
52e9578985 ([PATCH] Introducing software archaeologist's tool
"pickaxe"., 2005-05-21).

In the case of -S we were not working with the xdiff interface and
needed to do this, but when -G was implemented in f506b8e8b5 (git
log/diff: add -G<regexp> that greps in the patch text, 2010-08-23)
this logic was diligently copied over.

But as the performance test added earlier in this series shows, this
does not make much of a difference. With:

    time GIT_TEST_LONG= GIT_PERF_REPEAT_COUNT=10 GIT_PERF_MAKE_OPTS='-j8 CFLAGS=-O3' ./run origin/next HEAD~ HEAD -- p4209-pickaxe.sh

With the HEAD~ commit being the preceding "pickaxe -G: terminate early
on matching lines" we get these results. Note that it's only the -G
codepaths that are relevant to this change:

    Test                                                                      origin/next       HEAD~                   HEAD
    -----------------------------------------------------------------------------------------------------------------------------------------
    4209.1: git log -S'int main' <limit-rev>..                                0.35(0.32+0.03)   0.35(0.33+0.02) +0.0%   0.35(0.30+0.05) +0.0%
    4209.2: git log -S'æ' <limit-rev>..                                       0.46(0.42+0.04)   0.46(0.41+0.05) +0.0%   0.46(0.42+0.04) +0.0%
    4209.3: git log --pickaxe-regex -S'(int|void|null)' <limit-rev>..         0.65(0.62+0.02)   0.64(0.61+0.02) -1.5%   0.64(0.60+0.04) -1.5%
    4209.4: git log --pickaxe-regex -S'if *\([^ ]+ & ' <limit-rev>..          0.52(0.45+0.06)   0.52(0.50+0.01) +0.0%   0.54(0.47+0.04) +3.8%
    4209.5: git log --pickaxe-regex -S'[àáâãäåæñøùúûüýþ]' <limit-rev>..       0.39(0.34+0.05)   0.39(0.34+0.04) +0.0%   0.39(0.36+0.03) +0.0%
    4209.6: git log -G'(int|void|null)' <limit-rev>..                         0.60(0.55+0.04)   0.58(0.54+0.03) -3.3%   0.58(0.49+0.08) -3.3%
    4209.7: git log -G'if *\([^ ]+ & ' <limit-rev>..                          0.61(0.52+0.06)   0.59(0.53+0.05) -3.3%   0.59(0.54+0.05) -3.3%
    4209.8: git log -G'[àáâãäåæñøùúûüýþ]' <limit-rev>..                       0.61(0.51+0.07)   0.58(0.54+0.04) -4.9%   0.57(0.51+0.06) -6.6%
    4209.9: git log -i -S'int main' <limit-rev>..                             0.36(0.31+0.04)   0.36(0.34+0.02) +0.0%   0.35(0.32+0.03) -2.8%
    4209.10: git log -i -S'æ' <limit-rev>..                                   0.36(0.33+0.03)   0.39(0.34+0.01) +8.3%   0.36(0.32+0.03) +0.0%
    4209.11: git log -i --pickaxe-regex -S'(int|void|null)' <limit-rev>..     0.83(0.77+0.05)   0.82(0.77+0.05) -1.2%   0.80(0.75+0.04) -3.6%
    4209.12: git log -i --pickaxe-regex -S'if *\([^ ]+ & ' <limit-rev>..      0.67(0.61+0.03)   0.64(0.61+0.03) -4.5%   0.63(0.61+0.02) -6.0%
    4209.13: git log -i --pickaxe-regex -S'[àáâãäåæñøùúûüýþ]' <limit-rev>..   0.40(0.37+0.02)   0.40(0.37+0.03) +0.0%   0.40(0.36+0.04) +0.0%
    4209.14: git log -i -G'(int|void|null)' <limit-rev>..                     0.58(0.51+0.07)   0.59(0.52+0.06) +1.7%   0.58(0.52+0.05) +0.0%
    4209.15: git log -i -G'if *\([^ ]+ & ' <limit-rev>..                      0.60(0.54+0.05)   0.60(0.54+0.06) +0.0%   0.60(0.56+0.03) +0.0%
    4209.16: git log -i -G'[àáâãäåæñøùúûüýþ]' <limit-rev>..                   0.58(0.51+0.06)   0.57(0.52+0.05) -1.7%   0.60(0.48+0.09) +3.4%

This small simplification really doesn't buy us much now, but I've got
plans to both convert the pickaxe code to using a PCREv2 backend[1]
and to implement additional pickaxe modes to do custom searches
through the diff[2]. Always having the diff available under -G is
going to help to simplify both of those changes.

1. https://lore.kernel.org/git/20210203032811.14979-22-avarab@gmail.com/
2. https://lore.kernel.org/git/20190424152215.16251-3-avarab@gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11 12:47:31 +09:00
fa59e7beb2 pickaxe -G: terminate early on matching lines
Solve a long-standing item for "git log -Grx" of us e.g. finding "+
str" in the diff context and noting that we had a "hit", but xdiff
diligently continuing to generate and spew the rest of the diff at
us. This makes use of a new "early return" xdiff interface added by
preceding commits.

The TODO item (or, the NEEDSWORK comment) has been there since "git
log -G" was implemented. See f506b8e8b5 (git log/diff: add -G<regexp>
that greps in the patch text, 2010-08-23).

But now with the support added in the preceding changes to the
xdiff-interface we can return early. Let's assert the behavior of that
new early-return xdiff-interface by having a BUG() call here to die if
it ever starts handing us needless work again.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11 12:47:31 +09:00
9e20442298 xdiff-interface: allow early return from xdiff_emit_line_fn
Finish the change started in the preceding commit and allow an early
return from "xdiff_emit_line_fn" callbacks, this will allows
diffcore-pickaxe.c to save itself redundant work.

Our xdiff interface also had the limitation of not being able to abort
early since the beginning, see d9ea73e056 (combine-diff: refactor
built-in xdiff interface., 2006-04-05). Although at that time
"xdiff_emit_line_fn" was called "xdiff_emit_consume_fn", and
"xdiff_emit_hunk_fn" didn't exist yet.

There was some work in this area of xdiff-interface.[ch] recently with
3b40a090fd (diff: avoid generating unused hunk header lines,
2018-11-02) and 7c61e25fbf (diff: use hunk callback for word-diff,
2018-11-02).

In combination those two changes allow us to not do any work on the
hunks and diff at all, but didn't change the status quo with regards
to consumers that e.g. want the diff lines, but might want to abort
early.

Whereas now we can abort e.g. on the first "-line" of a 1000 line diff
if that's all we needed.

This interface is rather scary as noted in the comment to
xdiff-interface.h being added here, as noted there a future change
could add more exit codes, and hack xdl_emit_diff() and friends to
ignore or skip things more selectively as a result.

I did not see an inherent reason for why xdl_emit_{diffrec,record}()
could not be changed to ferry the "xdiff_emit_line_fn" error code
upwards instead of returning -1 on all "ret < 0".

But doing so would require corresponding changes in xdl_emit_diff(),
xdl_diff(). I didn't see any issue with narrowly doing that to
accomplish what I needed here, but it would leave xdiff's own return
values in an inconsistent state.

Instead I've left it at returning a more conventional (for git's own
codebase) 1 for an early return, and translating it (or rather, all
non-zero) to -1 for xdiff's consumption.

The reason for most of the "stop" complexity in xdiff_outf() is
because we want to be able to abort early, but do so in a way that
doesn't skip the appropriate strbuf_reset() invocations.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11 12:47:31 +09:00
a8d5eb6dc0 xdiff-interface: prepare for allowing early return
Change the function prototype of xdiff_emit_line_fn to return an "int"
instead of "void". Change all of those functions to "return 0",
nothing checks those return values yet, and no behavior is being
changed.

In subsequent commits the interface will be changed to allow early
return via this new return value.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11 12:47:31 +09:00
5b0672a26e pickaxe -S: slightly optimize contains()
When the "log -S<pat>" switch counts occurrences of <pat> on the
pre-image and post-image of a change. As soon as we know we had e.g. 1
before and 2 now we can stop, we don't need to keep counting past 2.

With this change a diff between A and B may have different performance
characteristics than between B and A. That's OK in this case, since
we'll emit the same output, and the effect is to make one of them
better.

I'm picking a check of "one" first on the assumption that it's a more
common case to have files grow over time than not.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11 12:47:31 +09:00
5d35a9531c pickaxe: rename variables in has_changes() for brevity
Rename the {one,two}_contains variables to c{1,2}. This will make a
follow-up change easier to read.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11 12:47:31 +09:00
52e011cd2b pickaxe -S: support content with NULs under --pickaxe-regex
Fix a bug in the matching routine powering -S<rx> --pickaxe-regex so
that we won't abort early on content that has NULs in it.

We've had a hard requirement on REG_STARTEND since 2f8952250a (regex:
add regexec_buf() that can work on a non NUL-terminated string,
2016-09-21), but this sanity check dates back to d01d8c6782 (Support
for pickaxe matching regular expressions, 2006-03-29).

It wasn't needed anymore, and as the now-passing test shows, actively
getting in our way. Since we always require REG_STARTEND support we do
not need to stop at NULs. If we are dealing with a haystack with NUL
in it. The needle may be behind that NUL.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11 12:47:31 +09:00
2e197a7592 pickaxe: assert that we must have a needle under -G or -S
Assert early in diffcore_pickaxe() that we've got a needle to work
with under -G and -S.

This code is redundant to the check -G and -S get from
parse-options.c's get_arg(), which I'm adding a test for.

This check dates back to e1b161161d (diffcore-pickaxe: fix infinite
loop on zero-length needle, 2007-01-25) when "git log -S" could send
this code into an infinite loop.

It was then later refactored in 8fa4b09fb1 (pickaxe: hoist empty
needle check, 2012-10-28) into its current form, but it seemingly
wasn't noticed that in the meantime a move to the parse-options.c API
in dea007fb4c (diff: parse separate options like -S foo, 2010-08-05)
had made it redundant.

Let's retain some of the paranoia here with a BUG(), but there's no
need to be checking this in the pickaxe_match() inner loop.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11 12:47:31 +09:00
03c1f14acf pickaxe: refactor function selection in diffcore-pickaxe()
It's hard to read this codepath at a glance and reason about exactly
what combination of -G and -S will compile either regexes or kwset,
and whether we'll then dispatch to "diff_grep" or "has_changes".

Then in the "--find-object" case we aren't using the callback
function, but were previously passing down "has_changes".

Refactor this code to exhaustively check "opts", it's now more obvious
what callback function (or none) we want under what mode.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11 12:47:31 +09:00
d90d441c33 perf: add performance test for pickaxe
Add a test for the -G and -S pickaxe options and related options.

This test supports being run with GIT_TEST_LONG=1 to adjust the limit
on the number of commits from 1k to 10k. The 1k limit seems to hit a
good spot on git.git

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11 12:47:31 +09:00
102fdd2e07 pickaxe/style: consolidate declarations and assignments
Refactor contains() to do its assignments at the same time that it
does its declarations.

This code could have been refactored in ef90ab66e8 (pickaxe: use
textconv for -S counting, 2012-10-28) when a function call between the
declarations and assignments was removed.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11 12:47:31 +09:00
a47fcbe6e4 diff.h: move pickaxe fields together again
Move the pickaxe and pickaxe_opts fields next to each other again. In
a past life they'd been on adjacent lines, but when they got moved
from a global variable to the diff_options struct in 6b5ee137e5 (Diff
clean-up., 2005-09-21) they got split apart.

That split made sense at the time, the "char*" and "int" (flags)
options were being grouped, but we've long since abandoned that
pattern in the diff_options struct, and now it makes more sense to
group these together again.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11 12:47:31 +09:00
d26ec88009 pickaxe: die when --find-object and --pickaxe-all are combined
Neither the --pickaxe-all documentation nor --find-object's has ever
suggested that you can combine the two. See f506b8e8b5 (git log/diff:
add -G<regexp> that greps in the patch text, 2010-08-23) and
15af58c1ad (diffcore: add a pickaxe option to find a specific blob,
2018-01-04).

But we've silently tolerated it, which makes the logic in
diffcore_pickaxe() harder to reason about. Let's assert that we won't
have the two combined.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11 12:47:31 +09:00
188e9e28c5 pickaxe: die when -G and --pickaxe-regex are combined
When the -G and --pickaxe-regex options are combined we simply ignore
the --pickaxe-regex option. Let's die instead as suggested by our
documentation, since -G is always a regex.

When --pickaxe-regex was added in d01d8c6782 (Support for pickaxe
matching regular expressions, 2006-03-29) only the -S option
existed. Then when -G was added in f506b8e8b5 (git log/diff: add
-G<regexp> that greps in the patch text, 2010-08-23) neither the
documentation for --pickaxe-regex was updated accordingly, nor was
something like this assertion added.

Since 5bc3f0b567 (diffcore-pickaxe doc: document -S and -G properly,
2013-05-31) we've claimed that --pickaxe-regex should only be used
with -S, but have silently tolerated combining it with -G, let's die
instead.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11 12:47:31 +09:00
7cd5d5b299 pickaxe tests: add missing test for --no-pickaxe-regex being an error
Add a missing test for --no-pickaxe-regex. This has been an error ever
since before the -S or -G options were added, or since
7ae0b0cb65 (git-log (internal): more options., 2006-03-01).

The reason for adding this test is that Junio suggested in [1] in
response to a later test addition in this series that it might be good
to support --no-pickaxe-regex in combination with -G. This would allow
for fixed-string searching with -G, similr to grep's --fixed-strings
mode.

I agree that that would make sense if anyone would like to implement
it, but since it dies right now let's first add this test to assert
the existing long-standing behavior. We can always add support for
--[no-]pickaxe-regex in combination with -G at some later date.

1. http://lore.kernel.org/git/xmqqwnto9pt7.fsf@gitster.g

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11 12:47:31 +09:00
064952fc34 pickaxe tests: test for -G, -S and --find-object incompatibility
Add a test for the options sanity check added in 5e505257f2 (diff:
properly error out when combining multiple pickaxe options,
2018-01-04).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11 12:47:31 +09:00
69ae93089c pickaxe tests: add test for "log -S" not being a regex
No test in our test suite checked for "log -S<pat>" being a fixed
string, as opposed to "log -S<pat> --pickaxe-regex". Let's test for
it.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11 12:47:31 +09:00
c960939858 pickaxe tests: add test for diffgrep_consume() internals
In diffgrep_consume() we generate a diff, and then advance past the
"+" or "-" at the start of the line for matching. This has been done
ever since the code was added in f506b8e8b5 (git log/diff: add
-G<regexp> that greps in the patch text, 2010-08-23).

If we match "line" instead of "line + 1" no tests fail, i.e. we've got
zero coverage for whether any of our searches match the beginning of
the line or not. Let's add a test for this.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11 12:47:31 +09:00
6d0a40166e pickaxe tests: refactor to use test_commit --append --printf
Refactor the existing tests added in e0e7cb8080 (log -G: ignore
binary files, 2018-12-14) to use the --append option I added in
3373518cc8 (test-lib functions: add an --append option to
test_commit, 2021-01-12) and the --printf option added as part of an
in-flight topic of mine this commit depends on.

While I'm at it change some of the setup of the test to use a more
sensible pattern, e.g. setting up a temporary repo instead of creating
an orphan branch.

Since the -G and -S options will behave the same way with truncated
and removed content also change the "git rm" to emptying data.bin,
that's just catering to how test_commit works. The resulting test is
shorter.

See also f5d79bf7dd (tests: refactor a few tests to use "test_commit
--append", 2021-01-12) for prior similar refactoring.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11 12:47:31 +09:00
ecbff141a1 grep/pcre2 tests: reword comments referring to kwset
The kwset optimization has not been used by grep since
48de2a768c (grep: remove the kwset optimization, 2019-07-01).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11 12:47:31 +09:00
f0d4d398e2 test-lib: split up and deprecate test_create_repo()
Remove various redundant or obsolete code from the test_create_repo()
function, and split up its use in test-lib.sh from what tests need
from it.

This leave us with a pass-through wrapper for "git init" in
test-lib-functions.sh, in test-lib.sh we have the same, except for
needing to redirect stdout/stderr, and emitting an error ourselves if
it fails. We don't need to error() ourselves when test_create_repo()
is invoked, as the invocation will be a part of a test's "&&"-chain.

Everything below this paragraph is a detailed summary of the history
of test_create_repo() explaining why it's safe to remove the various
things it was doing:

 1. "mkdir -p" isn't needed because "git init" itself will create
    leading directories if needed.

 2. Since we're now a simple wrapper for "git init" we don't need to
    check that we have only one argument. If someone wants to run
    "test_create_repo --bare x" that's OK.

 3. We won't ever hit that "Cannot setup test environment"
    error.

    Checking the test environment sanity when doing "git init" dates
    back to eea420693b (t0000: catch trivial pilot errors.,
    2005-12-10) and 2ccd2027b0 (trivial: check, if t/trash directory
    was successfully created, 2006-01-05).

    We can also see it in another form a bit later in my own
    0d314ce834 (test-lib: use subshell instead of cd $new && .. && cd
    $old, 2010-08-30).

    But since 2006f0adae (t/test-lib: make sure Git has already been
    built, 2012-09-17) we already check if we have a built git
    earlier.

    The one thing this was testing after that 2012 change was that
    we'd just built "git", but not "git-init", but since
    3af4c7156c (tests: respect GIT_TEST_INSTALLED when initializing
    repositories, 2018-11-12) we invoke "git", not "git-init".

    So all of that's been checked already, and we don't need to
    re-check it here.

 4. We don't need to move .git/hooks out of the way.

    That dates back to c09a69a83e (Disable hooks during tests.,
    2005-10-16), since then hooks became disabled by default in
    f98f8cbac0 (Ship sample hooks with .sample suffix, 2008-06-24).

    So the hooks were already disabled by default, but as can be seen
    from "mkdir .git/hooks" changes various tests needed to re-setup
    that directory. Now they no longer do.

    This makes us implicitly depend on the default hooks being
    disabled, which is a good thing. If and when we'd have any
    on-by-default hooks (I see no reason we ever would) we'd want to
    see the subtle and not so subtle ways that would break the test
    suite.

 5. We don't need to "cd" to the "$repo" directory at all anymore.

    In the code being removed here we both "cd"'d to the repository
    before calling "init", and did so in a subshell.

    It's not important to do either, so both of those can be
    removed. We cd'd because this code grew from test-lib.sh code
    where we'd have done so already, see eedf8f97e5 (Abstract
    test_create_repo out for use in tests., 2006-02-17), and later
    "cd"'d inside a subshell since 0d314ce834 to avoid having to keep
    track of an "old pwd" variable to cd back after the setup.

    Being in the repository directory made moving the hooks around
    easier (we wouldn't have to fully qualify the path). Since we're
    not moving the hooks per #4 above we don't need to "cd" for that
    reason either.

 6. We can drop the --template argument and instead rely on the
    GIT_TEMPLATE_DIR set to the same path earlier in test-lib.sh. See
    8683a45d66 (Introduce GIT_TEMPLATE_DIR, 2006-12-19)

 7. We only needed that ">&3 2>&4" redirection when invoked from
    test-lib.sh.

    We could still invoke test_create_repo() there, but as the
    invocation is now trivial and we don't have a good reason to use
    test_create_repo() elsewhere let's call "git init" there
    ourselves.

 8. We didn't need to resolve "git" as
    "${GIT_TEST_INSTALLED:-$GIT_EXEC_PATH}/git$X" in test_create_repo(),
    even for the use of test-lib.sh

    PATH is already set up in test-lib.sh to start with
    GIT_TEST_INSTALLED and/or GIT_EXEC_PATH before
    test_create_repo() (now "git init") is called.. So we can simply
    run "git" and rely on the PATH lookup choosing the right
    executable.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11 12:45:19 +09:00
97c8aac9c5 test-lib: do not show advice about init.defaultBranch under --verbose
Arrange for the advice about naming the initial branch not to be shown
in the --verbose output of the test suite.

Since 675704c74d (init: provide useful advice about
init.defaultBranch, 2020-12-11) some tests have been very chatty with
repeated occurrences of this multi-line advice. Having it be this
verbose isn't helpful for anyone in the context of git's own test
suite, and it makes debugging tests that use their own "git init"
invocations needlessly distracting.

By setting the GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME variable early in
test-lib.sh itself we'll squash the warning not only for
test_create_repo(), as 675704c74d explicitly intended, but also for
other "git init" invocations.

And once we'd like to have this configuration set for all "git init"
invocations in the test suite we can get rid of the init.defaultBranch
configuration setting in test_create_repo(), as
repo_default_branch_name() in refs.c will take the GIT_TEST_* variable
over it being set.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11 12:45:18 +09:00
04d12d6590 test-lib: reformat argument list in test_create_repo()
Reformat an argument list changed in 675704c74d (init: provide useful
advice about init.defaultBranch, 2020-12-11) to have the "-c" on the
same line as the argument it sets. This whitespace-only change makes
it easier to review a subsequent commit.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11 12:45:18 +09:00
ba7d318504 submodule tests: use symbolic-ref --short to discover branch name
Change a use of $GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME added in
704fed9ea2 (tests: start moving to a different default main branch
name, 2020-10-23) to simply discover the initial branch name of a
repository set up in this function with "symbolic-ref --short".

That's something done in another test in 704fed9ea2, so doing it like
this seems like an omission, or rather an overly eager
search/replacement instead of fixing the test logic.

There are only three uses of the GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
variable in the test suite, this gets rid of one of those.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11 12:45:18 +09:00
47c88d16ba test-lib functions: add --printf option to test_commit
Add a --printf option to test_commit to allow writing to the file with
"printf" instead of "echo".

This is useful for writing "\n", "\0" etc., in particular in
combination with the --append option added in 3373518cc8 (test-lib
functions: add an --append option to test_commit, 2021-01-12).

I'm converting a few tests to use the new option rather than a manual
printf/add/commit combination to demonstrate its usefulness. While I'm
at it use "test_create_repo" where appropriate, and give the
first/second commit a meaningful/more conventional log message in
cases where no test cared about that message.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11 12:45:18 +09:00
8cfe386b78 describe tests: convert setup to use test_commit
Convert the setup of the describe tests to use test_commit when
possible. This makes use of the new --annotate option to test_commit.

Some of the setup here could simply be removed since the data being
created wasn't important to any of the subsequent tests, so I've done
so. E.g. assigning to the "one" variable was always useless, and just
checking that we can describe HEAD after the first commit wasn't
useful.

In the case of the "two" variable we could instead use the tag we just
created. See 5312ab11fb (Add describe test., 2007-01-13) for the
initial version of this code. There's other cases here like redundant
"test_tick" invocations, or the simplification of not echoing "X" to a
file we're about to tag as "x", now we just use "x" in both cases.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11 12:45:18 +09:00
6cf8d96fa2 test-lib functions: add an --annotated option to "test_commit"
Add an --annotated option to test_commit to create annotated tags. The
tag will share the same message as the commit, and we'll call
test_tick before creating it (unless --notick) is provided.

There's quite a few tests that could be simplified with this
construct. I've picked one to convert in this change as a
demonstration.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11 12:45:18 +09:00
5144219b7d test-lib-functions: document test_commit --no-tag
In 76b8b8d05c (test-lib functions: document arguments to test_commit,
2021-01-12) I added missing documentation to test_commit, but in less
than a month later in 3803a3a099 (t: add --no-tag option to
test_commit, 2021-02-09) we got another undocumented option. Let's fix
that.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11 12:45:17 +09:00
cb8fb7f861 test-lib-functions: reword "test_commit --append" docs
Reword the documentation for "test_commit --append" added in my
3373518cc8 (test-lib functions: add an --append option to test_commit,
2021-01-12).

A follow-up commit will make the "echo" part of this configurable, and
in any case saying "echo >>" rather than ">>" was redundant.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11 12:45:17 +09:00
b57913f205 test-lib tests: remove dead GIT_TEST_FRAMEWORK_SELFTEST variable
Stop setting the GIT_TEST_FRAMEWORK_SELFTEST variable. This was originally needed
back in 4231d1ba99 (t0000: do not get self-test disrupted by
environment warnings, 2018-09-20).

It hasn't been needed since I deleted the relevant code in test-lib.sh
in c0eedbc009 (test-lib: remove check_var_migration, 2021-02-09), I
just didn't notice that it was set here.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11 12:45:17 +09:00
edc23840b0 test-lib: bring $remove_trash out of retirement
There's no point in creating a repository or directory only to decide
right afterwards that we're skipping all the tests. We can save
ourselves the redundant "git init" or "mkdir" and "rm -rf" in this
case.

We carry around the "$remove_trash" variable because if the directory
is unexpectedly gone at test_done time we'll still want to hit the
"trash directory already removed" error, but not if we never created
the trash directory. See df4c0d1a79 (test-lib: abort when can't
remove trash directory, 2017-04-20) for the addition of that error.

So let's partially revert 06478dab4c (test-lib: retire $remove_trash
variable, 2017-04-23) and move the decision about whether to skip all
tests earlier.

Let's also fix a bug that was with us since abc5d372ec (Enable
parallel tests, 2008-08-08): we would leak $remove_trash from the
environment. We don't want this to error out, so let's reset it to the
empty string first:

     remove_trash=t GIT_SKIP_TESTS=t0001 ./t0001-init.sh

I tested this with --debug, see 4d0912a206 (test-lib.sh: do not barf
under --debug at the end of the test, 2017-04-24) for a bug we don't
want to re-introduce.

While I'm at it, let's move the HOME assignment to just before
test_create_repo, it could be lower, but it seems better to set it
before calling anything in test-lib-functions.sh

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11 12:45:17 +09:00
0e59f7ad67 merge-ort: split "distinct types" message into two translatable messages
The word "renamed" has two possible translations in many European
languages depending on whether one thing was renamed or two things were
renamed. Give translators freedom to alter any part of the message to
make it sound right in their language.

Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11 12:26:01 +09:00
49f38e2de4 The fifteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-10 16:59:47 +09:00
a0f521b56c Merge branch 'rs/repack-without-loosening-promised-objects'
"git repack -A -d" in a partial clone unnecessarily loosened
objects in promisor pack.

* rs/repack-without-loosening-promised-objects:
  repack: avoid loosening promisor objects in partial clones
2021-05-10 16:59:47 +09:00
44ccb7629a Merge branch 'ls/subtree'
"git subtree" updates.

* ls/subtree: (30 commits)
  subtree: be stricter about validating flags
  subtree: push: allow specifying a local rev other than HEAD
  subtree: allow 'split' flags to be passed to 'push'
  subtree: allow --squash to be used with --rejoin
  subtree: give the docs a once-over
  subtree: have $indent actually affect indentation
  subtree: don't let debug and progress output clash
  subtree: add comments and sanity checks
  subtree: remove duplicate check
  subtree: parse revs in individual cmd_ functions
  subtree: use "^{commit}" instead of "^0"
  subtree: don't fuss with PATH
  subtree: use "$*" instead of "$@" as appropriate
  subtree: use more explicit variable names for cmdline args
  subtree: use git-sh-setup's `say`
  subtree: use `git merge-base --is-ancestor`
  subtree: drop support for git < 1.7
  subtree: more consistent error propagation
  subtree: don't have loose code outside of a function
  subtree: t7900: add porcelain tests for 'pull' and 'push'
  ...
2021-05-10 16:59:47 +09:00
aaa3c8065d Merge branch 'bc/hash-transition-interop-part-1'
SHA-256 transition.

* bc/hash-transition-interop-part-1:
  hex: print objects using the hash algorithm member
  hex: default to the_hash_algo on zero algorithm value
  builtin/pack-objects: avoid using struct object_id for pack hash
  commit-graph: don't store file hashes as struct object_id
  builtin/show-index: set the algorithm for object IDs
  hash: provide per-algorithm null OIDs
  hash: set, copy, and use algo field in struct object_id
  builtin/pack-redundant: avoid casting buffers to struct object_id
  Use the final_oid_fn to finalize hashing of object IDs
  hash: add a function to finalize object IDs
  http-push: set algorithm when reading object ID
  Always use oidread to read into struct object_id
  hash: add an algo member to struct object_id
2021-05-10 16:59:46 +09:00
59b519ab7e am: learn to process quoted lines that ends with CRLF
In previous changes, mailinfo has learnt to process lines that decoded
from base64 or quoted-printable, and ends with CRLF.

Let's teach "am" that new trick, too.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-10 15:06:22 +09:00
133a4fda59 mailinfo: allow stripping quoted CR without warning
In previous changes, we've turned on warning for quoted CR in base64 or
quoted-printable email messages. Some projects see those quoted CR a lot,
they know that it happens most of the time, and they find it's desirable
to always strip those CR.

Those projects in question usually fall back to use other tools to handle
patches when receive such patches.

Let's help those projects handle those patches by stripping those
excessive CR.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-10 15:06:22 +09:00
f1aa299443 mailinfo: allow squelching quoted CRLF warning
In previous change, Git starts to warn for quoted CRLF in decoded
base64/QP email. Despite those warnings are usually helpful,
quoted CRLF could be part of some users' workflow.

Let's give them an option to turn off the warning completely.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-10 15:06:22 +09:00
0b689562ca mailinfo: warn if CRLF found in decoded base64/QP email
When SMTP servers receive 8-bit email messages, possibly with only
LF as line ending, some of them decide to change said LF to CRLF.

Some mailing list softwares, when receive 8-bit email messages,
decide to encode those messages in base64 or quoted-printable.

If an email is transfered through above mail servers, then distributed
by such mailing list softwares, the recipients will receive an email
contains a patch mungled with CRLF encoded inside another encoding.

Thus, such CR (in CRLF) couldn't be dropped by "mailsplit".
Hence, the mailed patch couldn't be applied cleanly.
Such accidents have been observed in the wild [1].

Instead of silently rejecting those messages, let's give our users
some warnings if such CR (as part of CRLF) is found.

[1]: https://nmbug.notmuchmail.org/nmweb/show/m2lf9ejegj.fsf%40guru.guru-group.fi

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-10 15:06:22 +09:00
8c9ca6f095 pretty-formats.txt: add missing space
The description of "%ch" is missing a space after "human style", before
the parenthetical remark. This description was introduced in b722d4560e
("pretty: provide human date format", 2021-04-25). That commit also
added "%ah", which does have the space already.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-10 14:12:49 +09:00
fba8e4c3d0 git-repack.txt: remove spurious ")"
Drop the ")" at the end of this paragraph. There's a parenthetical
remark in this paragraph, but it's been closed on the line above.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-10 14:12:47 +09:00
2d677e5b15 The fourteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-07 12:47:42 +09:00
39c5392d68 Merge branch 'll/clone-reject-shallow'
Fix tests when forced to use v0 protocol.

* ll/clone-reject-shallow:
  t5601: mark protocol v2-only test
2021-05-07 12:47:42 +09:00
70a890d42f Merge branch 'si/zsh-complete-comment-fix'
Portability fix for command line completion script (in contrib/).

* si/zsh-complete-comment-fix:
  work around zsh comment in __git_complete_worktree_paths
2021-05-07 12:47:42 +09:00
18e1ba1092 Merge branch 'dl/complete-stash-updates'
Further update the command line completion (in contrib/) for "git
stash".

* dl/complete-stash-updates:
  git-completion.bash: consolidate cases in _git_stash()
  git-completion.bash: use $__git_cmd_idx in more places
  git-completion.bash: rename to $__git_cmd_idx
  git-completion.bash: separate some commands onto their own line
2021-05-07 12:47:41 +09:00
848a17c274 Merge branch 'dl/complete-stash'
The command line completion (in contrib/) for "git stash" has been
updated.

* dl/complete-stash:
  git-completion.bash: use __gitcomp_builtin() in _git_stash()
  git-completion.bash: extract from else in _git_stash()
  git-completion.bash: pass $__git_subcommand_idx from __git_main()
2021-05-07 12:47:41 +09:00
936e58851a Merge branch 'ah/plugleaks'
Plug various leans reported by LSAN.

* ah/plugleaks:
  builtin/rm: avoid leaking pathspec and seen
  builtin/rebase: release git_format_patch_opt too
  builtin/for-each-ref: free filter and UNLEAK sorting.
  mailinfo: also free strbuf lists when clearing mailinfo
  builtin/checkout: clear pending objects after diffing
  builtin/check-ignore: clear_pathspec before returning
  builtin/bugreport: don't leak prefixed filename
  branch: FREE_AND_NULL instead of NULL'ing real_ref
  bloom: clear each bloom_key after use
  ls-files: free max_prefix when done
  wt-status: fix multiple small leaks
  revision: free remainder of old commit list in limit_list
2021-05-07 12:47:41 +09:00
8585d6c04a Merge branch 'ps/rev-list-object-type-filter'
"git rev-list" learns the "--filter=object:type=<type>" option,
which can be used to exclude objects of the given kind from the
packfile generated by pack-objects.

* ps/rev-list-object-type-filter:
  rev-list: allow filtering of provided items
  pack-bitmap: implement combined filter
  pack-bitmap: implement object type filter
  list-objects: implement object type filter
  list-objects: support filtering by tag and commit
  list-objects: move tag processing into its own function
  revision: mark commit parents as NOT_USER_GIVEN
  uploadpack.txt: document implication of `uploadpackfilter.allow`
2021-05-07 12:47:41 +09:00
826ef0e5e5 Merge branch 'ab/svn-tests-set-e-fix'
Test clean-up.

* ab/svn-tests-set-e-fix:
  svn tests: refactor away a "set -e" in test body
  svn tests: remove legacy re-setup from init-clone test
2021-05-07 12:47:40 +09:00
0377ac98dc Merge branch 'ab/rebase-no-reschedule-failed-exec'
"git rebase --[no-]reschedule-failed-exec" did not work well with
its configuration variable, which has been corrected.

* ab/rebase-no-reschedule-failed-exec:
  rebase: don't override --no-reschedule-failed-exec with config
  rebase tests: camel-case rebase.rescheduleFailedExec consistently
2021-05-07 12:47:40 +09:00
5a357fa477 Merge branch 'ab/doc-lint'
Dev support.

* ab/doc-lint:
  docs: fix linting issues due to incorrect relative section order
  doc lint: lint relative section order
  doc lint: lint and fix missing "GIT" end sections
  doc lint: fix bugs in, simplify and improve lint script
  doc lint: Perl "strict" and "warnings" in lint-gitlink.perl
  Documentation/Makefile: make doc.dep dependencies a variable again
  Documentation/Makefile: make $(wildcard howto/*.txt) a var
2021-05-07 12:47:40 +09:00
fe069dce62 Merge branch 'mt/add-rm-in-sparse-checkout'
"git add" and "git rm" learned not to touch those paths that are
outside of sparse checkout.

* mt/add-rm-in-sparse-checkout:
  rm: honor sparse checkout patterns
  add: warn when asked to update SKIP_WORKTREE entries
  refresh_index(): add flag to ignore SKIP_WORKTREE entries
  pathspec: allow to ignore SKIP_WORKTREE entries on index matching
  add: make --chmod and --renormalize honor sparse checkouts
  t3705: add tests for `git add` in sparse checkouts
  add: include magic part of pathspec on --refresh error
2021-05-07 12:47:40 +09:00
e706aaf3bc Merge branch 'ps/config-global-override'
Replace GIT_CONFIG_NOSYSTEM mechanism to decline from reading the
system-wide configuration file with GIT_CONFIG_SYSTEM that lets
users specify from which file to read the system-wide configuration
(setting it to an empty file would essentially be the same as
setting NOSYSTEM), and introduce GIT_CONFIG_GLOBAL to override the
per-user configuration in $HOME/.gitconfig.

* ps/config-global-override:
  t1300: fix unset of GIT_CONFIG_NOSYSTEM leaking into subsequent tests
  config: allow overriding of global and system configuration
  config: unify code paths to get global config paths
  config: rename `git_etc_config()`
2021-05-07 12:47:39 +09:00
f16a4660de Merge branch 'zh/pretty-date-human'
"git log --format=..." placeholders learned %ah/%ch placeholders to
request the --date=human output.

* zh/pretty-date-human:
  pretty: provide human date format
2021-05-07 12:47:39 +09:00
c108c8c2f2 Merge branch 'zh/format-ref-array-optim'
"git (branch|tag) --format=..." has been micro-optimized.

* zh/format-ref-array-optim:
  ref-filter: reuse output buffer
  ref-filter: get rid of show_ref_array_item
2021-05-07 12:47:39 +09:00
bb2feec17f Merge branch 'ad/cygwin-no-backslashes-in-paths'
Cygwin pathname handling fix.

* ad/cygwin-no-backslashes-in-paths:
  cygwin: disallow backslashes in file names
2021-05-07 12:47:39 +09:00
6d99f31dda Merge branch 'jz/apply-3way-first-message-fix'
When we swapped the order of --3way fallback, we forgot to adjust
the message we give when the first method fails and the second
method is attempted (which used to be "direct application failed
hence we try 3way", now it is the other way around).

* jz/apply-3way-first-message-fix:
  apply: adjust messages to account for --3way changes
2021-05-07 12:47:38 +09:00
6e08cbdf38 Merge branch 'jk/prune-with-bitmap-fix'
When the reachability bitmap is in effect, the "do not lose
recently created objects and those that are reachable from them"
safety to protect us from races were disabled by mistake, which has
been corrected.

* jk/prune-with-bitmap-fix:
  prune: save reachable-from-recent objects with bitmaps
  pack-bitmap: clean up include_check after use
2021-05-07 12:47:38 +09:00
e60e9cc20e Merge branch 'po/diff-patch-doc'
Doc update.

* po/diff-patch-doc:
  doc: point to diff attribute in patch format docs
2021-05-07 12:47:38 +09:00
a850356d1b Merge branch 'hn/trace-reflog-expiry'
The reflog expiry machinery has been taught to emit trace events.

* hn/trace-reflog-expiry:
  refs/debug: trace into reflog expiry too
2021-05-07 12:47:38 +09:00
e5d99d378b Merge branch 'ab/pretty-date-format-tests'
Tweak a few tests for "log --format=..." that show timestamps in
various formats.

* ab/pretty-date-format-tests:
  pretty tests: give --date/format tests a better description
  pretty tests: simplify %aI/%cI date format test
2021-05-07 12:47:38 +09:00
5f586f55a0 Merge branch 'ps/config-env-option-with-separate-value'
"git --config-env var=val cmd" weren't accepted (only
--config-env=var=val was).

* ps/config-env-option-with-separate-value:
  git: support separate arg for `--config-env`'s value
  git.txt: fix synopsis of `--config-env` missing the equals sign
2021-05-07 12:47:37 +09:00
3a7f0908b6 clean: remove unnecessary variable
The variable `matches` used to hold the return of a `dir_path_match()`
call that was removed in 95c11ecc73 ("Fix error-prone fill_directory()
API; make it only return matches", 2020-04-01). Now `matches` will
always hold 0, which is the value it's initialized with; and the
condition `matches != MATCHED_EXACTLY` will always evaluate to true. So
let's remove this unnecessary variable.

Interestingly, it seems that `matches != MATCHED_EXACTLY` was already
unnecessary before 95c11ecc73. That's because `remove_directories` is
always set to 1 when we have pathspecs; So, in the condition
`!remove_directories && matches != MATCHED_EXACTLY`, we would either:

- have pathspecs (or have been given `-d`) and ignore `matches` because
  `remove_directories` is 1; or

- not have pathspecs (nor `-d`) and end up just checking that
  `0 != MATCHED_EXACTLY`, as `matches` would never get reassigned
  after its zero initialization (because there is no pathspec to match).

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-07 07:48:11 +09:00
dd9323b7fb mailinfo: stop parsing options manually
In a later change, mailinfo will learn more options, let's switch to our
robust parse_options framework before that step.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-07 06:40:26 +09:00
d582992e80 mailinfo: load default metainfo_charset lazily
In a later change, we will use parse_option to parse mailinfo's options.
In mailinfo, both "-u", "-n", and "--encoding" try to set the same
field, with "-u" reset that field to some default value from
configuration variable "i18n.commitEncoding".

Let's delay the setting of that field until we finish processing all
options. By doing that, "i18n.commitEncoding" can be parsed on demand.
More importantly, it cleans the way for using parse_option.

This change introduces some inconsistent brackets "{}" in "if/else if"
construct, however, we will rewrite them in the next few changes.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-07 06:40:25 +09:00
a1989cf7b8 add: die if both --dry-run and --interactive are given
The interactive machinery does not obey --dry-run. Die appropriately
if both flags are passed.

Signed-off-by: Øystein Walle <oystwa@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-07 06:14:04 +09:00
256c2dc42c perl: use mock i18n functions under NO_GETTEXT=Y
Change the logic of the i18n functions I added in 5e9637c629 (i18n:
add infrastructure for translating Git with gettext, 2011-11-18) to
use pass-through functions when NO_GETTEXT is defined.

This speeds up the compilation time of commands that use this library
when NO_GETTEXT=Y is in effect. Loading it and POSIX.pm is around 20ms
on my machine, whereas it takes 2ms to just instantiate perl itself.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-06 12:58:33 +09:00
368a50d9ee Makefile: regenerate *.pm on NO_PERL_CPAN_FALLBACKS change
Regenerate the *.pm files in perl/build/* if the
NO_PERL_CPAN_FALLBACKS flag added to the *.pm files in
1aca69c019 (perl Git::LoadCPAN: emit better errors under
NO_PERL_CPAN_FALLBACKS, 2018-03-03) is changed.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-06 12:58:32 +09:00
3d49f7220a Makefile: regenerate perl/build/* if GIT-PERL-DEFINES changes
Change the logic to generate perl/build/* to regenerate those files if
GIT-PERL-DEFINES changes. This ensures that e.g. changing localedir
will result in correctly re-generated files.

I don't think that ever worked. The brokenness pre-dates my
20d2a30f8f (Makefile: replace perl/Makefile.PL with simple make
rules, 2017-12-10).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-06 12:58:30 +09:00
4070c9e09f Makefile: don't re-define PERL_DEFINES
Since 07d90eadb5 (Makefile: add Perl runtime prefix support,
2018-04-10) we have been declaring PERL_DEFINES right after assigning
to it, with the effect that the first PERL_DEFINES was ignored.

That bug didn't matter in practice since the first line had all the
same variables as the second, so we'd correctly re-generate
everything. It just made for confusing reading.

Let's remove that first assignment, and while we're at it split these
across lines to make them more maintainable.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-06 12:58:27 +09:00
d4e2d15a8b streaming.c: move {open,close,read} from vtable to "struct git_istream"
Move the definition of the structure around the open/close/read
functions introduced in 46bf043807 (streaming: a new API to read from
the object store, 2011-05-11) to instead populate "close" and "read"
members in the "struct git_istream".

This gets us rid of an extra pointer deference, and I think makes more
sense. The "close" and "read" functions are the primary interface to
the stream itself.

Let's also populate a "open" callback in the same struct. That's now
used by open_istream() after istream_source() decides what "open"
function should be used. This isn't needed to get rid of the
"stream_vtbl" variables, but makes sense for consistency.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-06 12:56:10 +09:00
de94c0eace streaming.c: stop passing around "object_info *" to open()
Change the streaming interface to stop passing around the "struct
object_info" the open() functions.

As seen in 7ef2d9a260 (streaming: read non-delta incrementally from a
pack, 2011-05-13) which introduced the "st->u.in_pack" assignments
being changed here only the open_istream_pack_non_delta() path need
these.

So let's instead do this when preparing the selected callback in the
istream_source() function. This might also allow the compiler to
reduce the lifetime of the "oi" variable, as we've moved it from
"git_istream()" to "istream_source()".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-06 12:56:09 +09:00
bc062ad001 streaming.c: remove {open,close,read}_method_decl() macros
Remove the {open,close,read}_method_decl() macros added in
46bf043807 (streaming: a new API to read from the object store,
2011-05-11) in favor of inlining the definition of the arguments of
these functions.

Since we'll end up using them via the "{open,close,read}_istream_fn"
types we don't gain anything in the way of compiler checking by using
these macros, and as of preceding commits we no longer need to declare
these argument lists twice. So declaring them at a distance just
serves to make the code less readable.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-06 12:56:06 +09:00
0d9af06e36 streaming.c: remove enum/function/vtbl indirection
Remove the indirection of discovering a function pointer to use via an
enum and virtual table. This refactors code added in
46bf043807 (streaming: a new API to read from the object store,
2011-05-11).

We can instead simply return an "open_istream_fn" for use from the
"istream_source()" selector function directly. This allows us to get
rid of the "incore", "loose" and "pack_non_delta" enum
variables. We'll return the functions instead.

The "stream_error" variable in that enum can likewise go in favor of
returning NULL, which is what the open_istream() was doing when it got
that value anyway.

We can thus remove the entire enum, and the "open_istream_tbl" virtual
table that (indirectly) referenced it.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-06 12:56:04 +09:00
b65528360f streaming.c: avoid forward declarations
Change code added in 46bf043807 (streaming: a new API to read from
the object store, 2011-05-11) to avoid forward declarations of the
functions it uses. We can instead move this code to the bottom of the
file, and thus avoid the open_method_decl() calls.

Aside from the addition of the "static helpers[...]" comment being
added here, and the removal of the forward declarations this is a
move-only change.

The style of the added "static helpers[...]"  comment isn't in line
with our usual coding style, but is consistent with several other
comments used in this file, so let's use that style consistently here.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-06 12:56:02 +09:00
b79f9c075d sparse-index.c: remove set_index_sparse_config()
Remove the set_index_sparse_config() function by folding it into
set_sparse_index_config(), which was its only user.

Since 122ba1f7b5 (sparse-checkout: toggle sparse index from builtin,
2021-03-30) the flow of this code hasn't made much sense, we'd get
"enabled" in set_sparse_index_config(), proceed to call
set_index_sparse_config() with it.

There we'd call prepare_repo_settings() and set
"repo->settings.sparse_index = 1", only to needlessly call
prepare_repo_settings() again in set_sparse_index_config() (where it
would early abort), and finally setting "repo->settings.sparse_index =
enabled".

Instead we can just call prepare_repo_settings() once, and set the
variable to "enabled" in the first place.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-06 12:53:46 +09:00
6b79818bfb git-p4: speed up search for branch parent
For every new branch that git-p4 imports, it needs to find the commit
where it branched off its parent branch. While p4 doesn't record this
information explicitly, the first changelist on a branch is usually an
identical copy of the parent branch.

The method searchParent() tries to find a commit in the history of the
given "parent" branch whose tree exactly matches the initial changelist
of the new branch, "target". The code iterates through the parent
commits and compares each of them to this initial changelist using
diff-tree.

Since we already know the tree object name we are looking for, spawning
diff-tree for each commit is wasteful.

Use the "--format" option of "rev-list" to find out the tree object name
of each commit in the history, and find the tree whose name is exactly
the same as the tree of the target commit to optimize this.

This results in a considerable speed-up, at least on Windows. On one
Windows machine with a fairly large repository of about 16000 commits in
the parent branch, the current code takes over 7 minutes, while the new
code only takes just over 10 seconds for the same changelist:

Before:

    $ time git p4 sync
    Importing from/into multiple branches
    Depot paths: //depot
    Importing revision 31274 (100.0%)
    Updated branches: b1

    real    7m41.458s
    user    0m0.000s
    sys     0m0.077s

After:

    $ time git p4 sync
    Importing from/into multiple branches
    Depot paths: //depot
    Importing revision 31274 (100.0%)
    Updated branches: b1

    real    0m10.235s
    user    0m0.000s
    sys     0m0.062s

Signed-off-by: Joachim Kuebart <joachim.kuebart@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Luke Diamand <luke@diamand.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-06 12:51:33 +09:00
c3ab08844c git-p4: ensure complex branches are cloned correctly
When importing a branch from p4, git-p4 searches the history of the parent
branch for the branch point. The test for the complex branch structure
ensures all files have the expected contents, but doesn't examine the
branch structure.

Check for the correct branch structure by making sure that the initial
commit on each branch is empty. This ensures that the initial commit's
parent is indeed the correct branch-off point.

Signed-off-by: Joachim Kuebart <joachim.kuebart@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-06 12:51:31 +09:00
f91371b948 patience diff: remove unused variable
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-05 18:56:48 +09:00
204aa2d24d patience diff: remove unnecessary string comparisons
xdl_prepare_env() calls xdl_classify_record() which arranges for the
hashes of non-matching lines to be different so lines can be tested
for equality by comparing just their hashes.

This reduces the time taken to calculate the diff of v2.28.0 to
v2.29.0 by ~3-4%.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-05 18:56:48 +09:00
0324e8fc6b word diff: handle zero length matches
If find_word_boundaries() encounters a zero length match (which can be
caused by matching a newline or using '*' instead of '+' in the regex)
we stop splitting the input into words which generates an inaccurate
diff. To fix this increment the start point when there is a zero
length match and try a new match. This is safe as posix regular
expressions always return the longest available match so a zero length
match means there are no longer matches available from the current
position.

Commit bf82940dbf (color-words: enable REG_NEWLINE to help user,
2009-01-17) prevented matching newlines in negated character classes
but it is still possible for the user to have an explicit newline
match in the regex which could cause a zero length match.

One could argue that having explicit newline matches or using '*'
rather than '+' are user errors but it seems to be better to work
round them than produce inaccurate diffs.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-05 18:53:42 +09:00
87094fc2da ci: run test round with parallel-checkout enabled
We already have tests for the basic parallel-checkout operations. But
this code can also run be executed by other commands, such as
git-read-tree and git-sparse-checkout, which are currently not tested
with multiple workers. To promote a wider test coverage without
duplicating tests:

1. Add the GIT_TEST_CHECKOUT_WORKERS environment variable, to optionally
   force parallel-checkout execution during the whole test suite.

2. Set this variable (with a value of 2) in the second test round of our
   linux-gcc CI job. This round runs `make test` again with some
   optional GIT_TEST_* variables enabled, so there is no additional
   overhead in exercising the parallel-checkout code here.

Note that tests checking out less than two parallel-eligible entries
will fall back to the sequential mode. Nevertheless, it's still a good
exercise for the parallel-checkout framework as the fallback codepath
also writes the queued entries using the parallel-checkout functions
(only without spawning any worker).

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-05 12:27:17 +09:00
d0e5d35700 parallel-checkout: add tests for basic operations
Add tests to populate the working tree during clone and checkout using
sequential and parallel mode, to confirm that they produce identical
results. Also test basic checkout mechanics, such as checking for
symlinks in the leading directories and the abidance to --force.

Note: some helper functions are added to a common lib file which is only
included by t2080 for now. But they will also be used by other
parallel-checkout tests in the following patches.

Co-authored-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-05 12:26:36 +09:00
d5904220bc parallel-checkout: add tests related to .gitattributes
Add tests to confirm that the `struct conv_attrs` data is correctly
passed from the main process to the workers, and that they can properly
convert the blobs before writing them to the working tree.

Also check that parallel-ineligible entries, such as regular files that
require external filters, are correctly smudge and written when
parallel-checkout is enabled.

Co-authored-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-05 12:26:36 +09:00
70b052b209 checkout-index: add parallel checkout support
Allow checkout-index to use the parallel checkout framework, honoring
the checkout.workers configuration.

There are two code paths in checkout-index which call
`checkout_entry()`, and thus, can make use of parallel checkout:
`checkout_file()`, which is used to write paths explicitly given at the
command line; and `checkout_all()`, which is used to write all paths in
the index, when the `--all` option is given.

In both operation modes, checkout-index doesn't abort immediately on a
`checkout_entry()` failure. Instead, it tries to check out all remaining
paths before exiting with a non-zero exit code. To keep this behavior
when parallel checkout is being used, we must allow
`run_parallel_checkout()` to try writing the queued entries before we
exit, even if we already got an error code from a previous
`checkout_entry()` call.

However, `checkout_all()` doesn't return on errors, it calls `exit()`
with code 128. We could make it call `run_parallel_checkout()` before
exiting, but it makes the code easier to follow if we unify the exit
path for both checkout-index modes at `cmd_checkout_index()`, and let
this function take care of the interactions with the parallel checkout
API. So let's do that.

With this change, we also have to consider whether we want to keep using
128 as the error code for `git checkout-index --all`, while we use 1 for
`git checkout-index <path>` (even when the actual error is the same).
Since there is not much value in having code 128 only for `--all`, and
there is no mention about it in the docs (so it's unlikely that changing
it will break any existing script), let's make both modes exit with code
1 on `checkout_entry()` errors.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-05 12:26:36 +09:00
2fa3cbadcd t0028: extract encoding helpers to lib-encoding.sh
The following patch will add tests outside t0028 which will also need to
re-encode some strings. Extract the auxiliary encoding functions from
t0028 to a common lib file so that they can be reused.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-05 12:26:36 +09:00
6a7bc9d118 parallel-checkout: add tests related to path collisions
Add tests to confirm that path collisions are properly detected by
checkout workers, both to avoid race conditions and to report colliding
entries on clone.

Co-authored-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-05 12:26:36 +09:00
6053950632 builtin/checkout.c: complete parallel checkout support
Pathspec-limited checkouts (like `git checkout *.txt`) are performed by
a code path that doesn't yet support parallel checkout because it calls
checkout_entry() directly, instead of unpack_trees(). Let's add parallel
checkout support for this code path too.

The transient cache entries allocated in checkout_merged() are now
allocated in a mem_pool which is only discarded after parallel checkout
finishes. This is done because the entries need to be valid when
run_parallel_checkout() is called.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-05 12:26:33 +09:00
9616882780 make_transient_cache_entry(): optionally alloc from mem_pool
Allow make_transient_cache_entry() to optionally receive a mem_pool
struct in which it should allocate the entry. This will be used in the
following patch, to store some transient entries which should persist
until parallel checkout finishes.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-05 12:25:25 +09:00
b89c731228 t5601: mark protocol v2-only test
A HTTP-clone test introduced in 4fe788b1b0 ("builtin/clone.c: add
--reject-shallow option", 2021-04-01) only works in protocol v2, but is
not marked as such.

The aforementioned patch implements --reject-shallow for a variety of
situations, but usage of a protocol that requires a remote helper is not
one of them. (Such an implementation would require extending the remote
helper protocol to support the passing of a "reject shallow" option, and
then teaching it to both protocol-speaking ends.)

For now, to make it pass when GIT_TEST_PROTOCOL_VERSION=0 is passed, add
"-c protocol.version=2". A more complete solution would be either to
augment the remote helper protocol to support this feature or to return
a fatal error when using --reject-shallow with a protocol that uses a
remote helper.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-05 10:54:41 +09:00
477673d6f3 send-pack: support push negotiation
Teach Git the push.negotiate config variable.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-05 10:41:29 +09:00
9c1e657a8f fetch: teach independent negotiation (no packfile)
Currently, the packfile negotiation step within a Git fetch cannot be
done independent of sending the packfile, even though there is at least
one application wherein this is useful. Therefore, make it possible for
this negotiation step to be done independently. A subsequent commit will
use this for one such application - push negotiation.

This feature is for protocol v2 only. (An implementation for protocol v0
would require a separate implementation in the fetch, transport, and
transport helper code.)

In the protocol, the main hindrance towards independent negotiation is
that the server can unilaterally decide to send the packfile. This is
solved by a "wait-for-done" argument: the server will then wait for the
client to say "done". In practice, the client will never say it; instead
it will cease requests once it is satisfied.

In the client, the main change lies in the transport and transport
helper code. fetch_refs_via_pack() performs everything needed - protocol
version and capability checks, and the negotiation itself.

There are 2 code paths that do not go through fetch_refs_via_pack() that
needed to be individually excluded: the bundle transport (excluded
through requiring smart_options, which the bundle transport doesn't
support) and transport helpers that do not support takeover. If or when
we support independent negotiation for protocol v0, we will need to
modify these 2 code paths to support it. But for now, report failure if
independent negotiation is requested in these cases.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-05 10:41:29 +09:00
15f3e1e056 t6423: rename file within directory that other side renamed
Add a new testcase where one side of history renames:
   olddir/ -> newdir/
and the other side of history renames:
   olddir/a -> olddir/alpha

When using merge.directoryRenames=true, it seems logical to expect the
file to end up at newdir/alpha.  Unfortunately, both merge-recursive and
merge-ort currently see this as a rename/rename conflict:

   olddir/a -> newdir/a
vs.
   olddir/a -> newdir/alpha

Suggesting that there's some extra logic we probably want to add
somewhere to allow this case to run without triggering a conflict.  For
now simply document this known issue.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-04 12:53:48 +09:00
f2acf763e2 work around zsh comment in __git_complete_worktree_paths
[PATCH]: contrib/completion/git-completion.bash, there is a construct
where comment lines are placed between the command that is on
the upstream of a pipe and the command that is on the downstream
of a pipe in __git_complete_worktree_paths function.

Unfortunately, this script is also used by Zsh completion, but
Zsh mishandles this construct when "interactive_comments" option is not
set (by default it is off on macOS), resulting in a breakage:

$ git worktree remove [TAB]
$ git worktree remove __git_complete_worktree_paths:7: command not found: #

Move the comment, even though it explains what happens on the
downstream of the pipe and logically belongs where it is right
now, before the entire pipeline, to work around this problem.

Signed-off-by: Sardorbek Imomaliev <sardorbek.imomaliev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-04 12:17:23 +09:00
c364b7ef51 trailer: add new .cmd config option
The `trailer.<token>.command` configuration variable
specifies a command (run via the shell, so it does not have
to be a single name or path to the command, but can be a
shell script), and the first occurrence of substring $ARG is
replaced with the value given to the `interpret-trailer`
command for the token in a '--trailer <token>=<value>' argument.

This has three downsides:

* The use of $ARG in the mechanism misleads the users that
the value is passed in the shell variable, and tempt them
to use $ARG more than once, but that would not work, as
the second and subsequent $ARG are not replaced.

* Because $ARG is textually replaced without regard to the
shell language syntax, even '$ARG' (inside a single-quote
pair), which a user would expect to stay intact, would be
replaced, and worse, if the value had an unmatched single
quote (imagine a name like "O'Connor", substituted into
NAME='$ARG' to make it NAME='O'Connor'), it would result in
a broken command that is not syntactically correct (or
worse).

* The first occurrence of substring `$ARG` will be replaced
with the empty string, in the command when the command is
first called to add a trailer with the specified <token>.
This is a bad design, the nature of automatic execution
causes it to add a trailer that we don't expect.

Introduce a new `trailer.<token>.cmd` configuration that
takes higher precedence to deprecate and eventually remove
`trailer.<token>.command`, which passes the value as an
argument to the command.  Instead of "$ARG", users can
refer to the value as positional argument, $1, in their
scripts. At the same time, in order to allow
`git interpret-trailers` to better simulate the behavior
of `git command -s`, 'trailer.<token>.cmd' will not
automatically execute.

Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Christian Couder <christian.couder@gmail.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-04 12:09:43 +09:00
57dcb6575b docs: correct descript of trailer.<token>.command
In the original documentation of `trailer.<token>.command`,
some descriptions are easily misunderstood. So let's modify
it to increase its readability.

In addition, clarify that `$ARG` in command can only be
replaced once.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-04 12:09:43 +09:00
8ff06de10c docs: document symlink restrictions for dot-files
We stopped allowing symlinks for .gitmodules files in 10ecfa7649
(verify_path: disallow symlinks in .gitmodules, 2018-05-04), and we
stopped following symlinks for .gitattributes, .gitignore, and .mailmap
in the commits from 204333b015 (Merge branch 'jk/open-dotgitx-with-nofollow',
2021-03-22). The reasons are discussed in detail there, but we never
adjusted the documentation to let users know.

This hasn't been a big deal since the point is that such setups were
mildly broken and thought to be unusual anyway. But it certainly doesn't
hurt to be clear and explicit about it.

Suggested-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-04 11:52:03 +09:00
bb6832d552 fsck: warn about symlinked dotfiles we'll open with O_NOFOLLOW
In the commits merged in via 204333b015 (Merge branch
'jk/open-dotgitx-with-nofollow', 2021-03-22), we stopped following
symbolic links for .gitattributes, .gitignore, and .mailmap files.

Let's teach fsck to warn that these symlinks are not going to do
anything. Note that this is just a warning, and won't block the objects
via transfer.fsckObjects, since there are reported to be cases of this
in the wild (and even once fixed, they will continue to exist in the
commit history of those projects, but are not particularly dangerous).

Note that we won't add these to the existing gitmodules block in the
fsck code. The logic for gitmodules is a bit more complicated, as we
also check the content of non-symlink instances we find. But for these
new files, there is no content check; we're just looking at the name and
mode of the tree entry (and we can avoid even the complicated name
checks in the common case that the mode doesn't indicate a symlink).

We can reuse the test helper function we defined for .gitmodules, though
(it needs some slight adjustments for the fsck error code, and because
we don't block these symlinks via verify_path()).

Note that I didn't explicitly test the transfer.fsckObjects case here
(nor does the existing .gitmodules test that it blocks a push). The
translation of fsck severities to outcomes is covered in general in
t5504.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-04 11:52:02 +09:00
801ed010bf t0060: test ntfs/hfs-obscured dotfiles
We have tests that cover various filesystem-specific spellings of
".gitmodules", because we need to reliably identify that path for some
security checks. These are from dc2d9ba318 (is_{hfs,ntfs}_dotgitmodules:
add tests, 2018-05-12), with the actual code coming from e7cb0b4455
(is_ntfs_dotgit: match other .git files, 2018-05-11) and 0fc333ba20
(is_hfs_dotgit: match other .git files, 2018-05-02).

Those latter two commits also added similar matching functions for
.gitattributes and .gitignore. These ended up not being used in the
final series, and are currently dead code. But in preparation for them
being used in some fsck checks, let's make sure they actually work by
throwing a few basic tests at them. Likewise, let's cover .mailmap
(which does need matching code added).

I didn't bother with the whole battery of tests that we cover for
.gitmodules. These functions are all based on the same generic matcher,
so it's sufficient to test most of the corner cases just once.

Note that the ntfs magic prefix names in the tests come from the
algorithm described in e7cb0b4455 (and are different for each file).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-04 11:52:02 +09:00
1cb12f3339 t7450: test .gitmodules symlink matching against obscured names
In t7450 we check that both verify_path() and fsck catch malformed
.gitmodules entries in trees. However, we don't check that we catch
filesystem-equivalent forms of these (e.g., ".GITMOD~1" on Windows).
Our name-matching functions are exercised well in t0060, but there's
nothing to test that we correctly call the matching functions from the
actual fsck and verify_path() code.

So instead of testing just .gitmodules, let's repeat our tests for a few
basic cases. We don't need to be exhaustive here (t0060 handles that),
but just make sure we hit one name of each type.

Besides pushing the tests into a function that takes the path as a
parameter, we'll need to do a few things:

  - adjust the directory name to accommodate the tests running multiple
    times

  - set core.protecthfs for index checks. Fsck always protects all types
    by default, but we want to be able to exercise the HFS routines on
    every system. Note that core.protectntfs is already the default
    these days, but it doesn't hurt to explicitly label our need for it.

  - we'll also take the filename ("gitmodules") as a parameter. All
    calls use the same name for now, but a future patch will extend this
    to handle other .gitfoo files. Note that our fake-content symlink
    destination is somewhat .gitmodules specific. But it isn't necessary
    for other files (which don't do a content check). And it happens to
    be a valid attribute and ignore file anyway.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-04 11:52:02 +09:00
a1ca398ba7 t7450: test verify_path() handling of gitmodules
Commit 10ecfa7649 (verify_path: disallow symlinks in .gitmodules,
2018-05-04) made it impossible to load a symlink .gitmodules file into
the index. However, there are no tests of this behavior. Let's make sure
this case is covered. We can easily reuse the test setup created by
the matching b7b1fca175 (fsck: complain when .gitmodules is a symlink,
2018-05-04).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-04 11:52:02 +09:00
43a2220f19 t7415: rename to expand scope
This script has already expanded beyond its original intent of ".. in
submodule names" to include other malicious submodule bits. Let's update
the name and description to reflect that, as well as the fact that we'll
soon be adding similar tests for other dotfiles (.gitattributes, etc).
We'll also renumber it to move it out of the group of submodule-specific
tests.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-03 14:41:08 +09:00
0282f6799f fsck_tree(): wrap some long lines
Many calls to report() in fsck_tree() are kept on a single line and are
quite long. Most were pretty big to begin with, but have gotten even
longer over the years as we've added more parameters. Let's accept the
churn of wrapping them in order to conform to our usual line limits.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-03 14:41:08 +09:00
9e1947cb48 fsck_tree(): fix shadowed variable
Commit b2f2039c2b (fsck: accept an oid instead of a "struct tree" for
fsck_tree(), 2019-10-18) introduced a new "oid" parameter to
fsck_tree(), and we pass it to the report() function when we find
problems. However, that is shadowed within the tree-walking loop by the
existing "oid" variable which we use to store the oid of each tree
entry. As a result, we may report the wrong oid for some problems we
detect within the loop (the entry oid, instead of the tree oid).

Our tests didn't catch this because they checked only that we found the
expected fsck problem, not that it was attached to the correct object.

Let's rename both variables in the function to avoid confusion. This
makes the diff a little noisy (e.g., all of the report() calls outside
the loop were already correct but need to be touched), but makes sure we
catch all cases and will avoid similar confusion in the future.

And we can update the test to be a bit more specific and catch this
problem.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-03 14:41:08 +09:00
963d02a24a t7415: remove out-dated comment about translation
Since GETTEXT_POISON does not exist anymore, there is no point warning
people about whether we should use test_i18ngrep. This is doubly
confusing because the comment was describing why it was OK to use grep,
but it got caught up in the mass conversion of 674ba34038 (fsck: mark
strings for translation, 2018-11-10).

Note there are other uses of test_i18ngrep in this script which are now
obsolete; I'll save those for a mass-cleanup. My goal here was just to
fix the confusing comment in code I'm about to refactor.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-03 14:41:08 +09:00
8e0601f568 docs/format-patch: mention handling of merges
Format-patch doesn't have a way to format merges in a way that can be
applied by git-am (or any other tool), and so it just omits them.
However, this may be a surprising implication for users who are not well
versed in how the tool works. Let's add a note to the documentation
making this more clear.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-03 14:32:39 +09:00
6d52b6a5df pack-objects: clamp negative depth to 0
A negative delta depth makes no sense, and the code is not prepared to
handle it. If passed "--depth=-1" on the command line, then this line
from break_delta_chains():

	cur->depth = (total_depth--) % (depth + 1);

triggers a divide-by-zero. This is undefined behavior according to the C
standard, but on POSIX systems results in SIGFPE killing the process.
This is certainly one way to inform the use that the command was
invalid, but it's a bit friendlier to just treat it as "don't allow any
deltas", which we already do for --depth=0.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-03 14:30:46 +09:00
49ac1d33bb t5316: check behavior of pack-objects --depth=0
We'd expect this to cleanly produce no deltas at all (as opposed to
getting confused by an out-of-bounds value), and it does.

Note we have to adjust our max_chain test helper, which expected to find
at least one delta.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-03 14:29:56 +09:00
953aa54e1a pack-objects: clamp negative window size to 0
A negative window size makes no sense, and the code in find_deltas() is
not prepared to handle it. If you pass "-1", for example, we end up
generate a 0-length array of "struct unpacked", but our loop assumes it
has at least one entry in it (and we end up reading garbage memory).

We could complain to the user about this, but it's more forgiving to
just clamp it to 0, which means "do not find any deltas at all". The
0-case is already tested earlier in the script, so we'll make sure this
does the same thing.

Reported-by: Yiyuan guo <yguoaz@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-03 14:29:27 +09:00
95356789ee t5300: check that we produced expected number of deltas
We pack a set of objects both with and without --window=0, assuming that
the 0-length window will cause us not to produce any deltas. Let's
confirm that this is the case.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-03 14:29:16 +09:00
5489899812 t5300: modernize basic tests
The first set of tests in t5300 goes back to 2005, and doesn't use some
of our customary style and tools these days. In preparation for touching
them, let's modernize a few things:

  - titles go on the line with test_expect_success, with a hanging
    open-quote to start the test body

  - test bodies should be indented with tabs

  - opening braces for shell blocks in &&-chains go on their own line

  - no space between redirect operators and files (">foo", not "> foo")

  - avoid doing work outside of test blocks; in this case, we can stick
    the setup of ".git2" into the appropriate blocks

  - avoid modifying and then cleaning up the environment or current
    directory by using subshells and "git -C"

  - this test does a curious thing when testing the unpacking: it sets
    GIT_OBJECT_DIRECTORY, and then does a "git init" in the _original_
    directory, creating a weird mixed situation. Instead, it's much
    simpler to just "git init --bare" a new repository to unpack into,
    and check the results there. I renamed this "git2" instead of
    ".git2" to make it more clear it's a separate repo.

  - we can observe that the bodies of the no-delta, ref_delta, and
    ofs_delta cases are all virtually identical except for the pack
    creation, and factor out shared helper functions. I collapsed "do
    the unpack" and "check the results of the unpack" into a single
    test, since that makes the expected lifetime of the "git2" temporary
    directory more clear (that also lets us use test_when_finished to
    clean it up). This does make the "-v" output slightly less useful,
    but the improvement in reading the actual test code makes it worth
    it.

  - I dropped the "pwd" calls from some tests. These don't do anything
    functional, and I suspect may have been an aid for debugging when
    the script was more cavalier about leaving the working directory
    changed between tests.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-03 14:29:16 +09:00
a84fd3bcc6 CodingGuidelines: explicitly allow "local" for test scripts
01d3a526 (t0000: check whether the shell supports the "local"
keyword, 2017-10-26) raised a test balloon to see if those who build
and test Git use a platform with a shell that lacks support for the
"local" keyword.  After two years, 7f0b5908 (t0000: reword comments
for "local" test, 2019-08-08) documented that "local" keyword, even
though is outside POSIX, is allowed in our test scripts.

Let's write it in the CodingGuidelines, too.  It might be tempting
to allow it in scripted Porcelains (we have avoided getting them
contaminiated by "local" so far), but they are on their way out and
getting rewritten in C.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-03 14:24:11 +09:00
ad9322da03 merge: fix swapped "up to date" message components
The rewrite of git-merge from shell to C in 1c7b76be7d (Build in merge,
2008-07-07) accidentally transformed the message:

    Already up-to-date. (nothing to squash)

to:

    (nothing to squash)Already up-to-date.

due to reversed printf() arguments. This problem has gone unnoticed
despite being touched over the years by 7f87aff22c (Teach/Fix pull/fetch
-q/-v options, 2008-11-15) and bacec47845 (i18n: git-merge basic
messages, 2011-02-22), and tangentially by bef4830e88 (i18n: merge: mark
messages for translation, 2016-06-17) and 7560f547e6 (treewide: correct
several "up-to-date" to "up to date", 2017-08-23).

Fix it by restoring the message to its intended order. While at it, help
translators out by avoiding "sentence Lego".

[es: rewrote commit message]

Co-authored-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Josh Soref <jsoref@gmail.com>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-03 14:14:58 +09:00
80cde95eec merge(s): apply consistent punctuation to "up to date" messages
Although the various "Already up to date" messages resulting from merge
attempts share identical phrasing, they use a mix of punctuation ranging
from "." to "!" and even "Yeeah!", which leads to extra work for
translators. Ease the job of translators by settling upon "." as
punctuation for all such messages.

While at it, take advantage of printf_ln() to further ease the
translation task so translators need not worry about line termination,
and fix a case of missing line termination in the (unused)
merge_ort_nonrecursive() function.

Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-03 14:14:56 +09:00
62af4bdd42 submodule update: silence underlying fetch with "--quiet"
Commands such as

    $ git submodule update --quiet --init --depth=1

involving shallow clones, call the shell function fetch_in_submodule, which
in turn invokes git fetch.  Pass the --quiet option onward there.

Signed-off-by: Nicholas Clark <nick@ccl4.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-03 12:24:38 +09:00
7e39198978 The thirteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-30 13:50:27 +09:00
93e0b28dbb Merge branch 'ab/pathname-encoding-doc'
Clarify that pathnames recorded in Git trees are most often (but
not necessarily) encoded in UTF-8.

* ab/pathname-encoding-doc:
  doc: clarify the filename encoding in git diff
2021-04-30 13:50:27 +09:00
5980e0d442 Merge branch 'vs/completion-with-set-u'
Effort to make the command line completion (in contrib/) safe with
"set -u" continues.

* vs/completion-with-set-u:
  completion: avoid aliased command lookup error in nounset mode
2021-04-30 13:50:27 +09:00
bf0d4c8491 Merge branch 'hn/refs-trace-errno'
Show errno in the trace output in the error codepath that calls
read_raw_ref method.

* hn/refs-trace-errno:
  refs: print errno for read_raw_ref if GIT_TRACE_REFS is set
2021-04-30 13:50:27 +09:00
8e97852919 Merge branch 'ds/sparse-index-protections'
Builds on top of the sparse-index infrastructure to mark operations
that are not ready to mark with the sparse index, causing them to
fall back on fully-populated index that they always have worked with.

* ds/sparse-index-protections: (47 commits)
  name-hash: use expand_to_path()
  sparse-index: expand_to_path()
  name-hash: don't add directories to name_hash
  revision: ensure full index
  resolve-undo: ensure full index
  read-cache: ensure full index
  pathspec: ensure full index
  merge-recursive: ensure full index
  entry: ensure full index
  dir: ensure full index
  update-index: ensure full index
  stash: ensure full index
  rm: ensure full index
  merge-index: ensure full index
  ls-files: ensure full index
  grep: ensure full index
  fsck: ensure full index
  difftool: ensure full index
  commit: ensure full index
  checkout: ensure full index
  ...
2021-04-30 13:50:26 +09:00
a1cac26cc6 Merge branch 'mt/parallel-checkout-part-2'
The checkout machinery has been taught to perform the actual
write-out of the files in parallel when able.

* mt/parallel-checkout-part-2:
  parallel-checkout: add design documentation
  parallel-checkout: support progress displaying
  parallel-checkout: add configuration options
  parallel-checkout: make it truly parallel
  unpack-trees: add basic support for parallel checkout
2021-04-30 13:50:26 +09:00
59bb0aa93e Merge branch 'so/log-diff-merge'
"git log" learned "--diff-merges=<style>" option, with an
associated configuration variable log.diffMerges.

* so/log-diff-merge:
  doc/diff-options: document new --diff-merges features
  diff-merges: introduce log.diffMerges config variable
  diff-merges: adapt -m to enable default diff format
  diff-merges: refactor set_diff_merges()
  diff-merges: introduce --diff-merges=on
2021-04-30 13:50:26 +09:00
d250f90359 Merge branch 'ds/maintenance-prefetch-fix'
The prefetch task in "git maintenance" assumed that "git fetch"
from any remote would fetch all its local branches, which would
fetch too much if the user is interested in only a subset of
branches there.

* ds/maintenance-prefetch-fix:
  maintenance: respect remote.*.skipFetchAll
  maintenance: use 'git fetch --prefetch'
  fetch: add --prefetch option
  maintenance: simplify prefetch logic
2021-04-30 13:50:25 +09:00
a819e2b3ef Merge branch 'ow/push-quiet-set-upstream'
"git push --quiet --set-upstream" was not quiet when setting the
upstream branch configuration, which has been corrected.

* ow/push-quiet-set-upstream:
  transport: respect verbosity when setting upstream
2021-04-30 13:50:25 +09:00
279a2e637a Merge branch 'mt/pkt-write-errors'
When packet_write() fails, we gave an extra error message
unnecessarily, which has been corrected.

* mt/pkt-write-errors:
  pkt-line: do not report packet write errors twice
2021-04-30 13:50:24 +09:00
13158b9910 Merge branch 'jk/promisor-optim'
Handling of "promisor packs" that allows certain objects to be
missing and lazily retrievable has been optimized (a bit).

* jk/promisor-optim:
  revision: avoid parsing with --exclude-promisor-objects
  lookup_unknown_object(): take a repository argument
  is_promisor_object(): free tree buffer after parsing
2021-04-30 13:50:24 +09:00
4cd66e7d6b bisect--helper: use BISECT_TERMS in 'bisect skip' command
Commit e4c7b33747 ("bisect--helper: reimplement `bisect_skip` shell
function in C", 2021-02-03), as part of the shell-to-C conversion,
forgot to read the 'terms' file (.git/BISECT_TERMS) during the new
'bisect skip' command implementation. As a result, the 'bisect skip'
command will use the default 'bad'/'good' terms. If the bisection
terms have been set to non-default values (for example by the
'bisect start' command), then the 'bisect skip' command will fail.

In order to correct this problem, we insert a call to the get_terms()
function, which reads the non-default terms from that file (if set),
in the '--bisect-skip' command implementation of 'bisect--helper'.

Also, add a test[1] to protect against potential future regression.

[1] https://lore.kernel.org/git/xmqqim45h585.fsf@gitster.g/T/#m207791568054b0f8cf1a3942878ea36293273c7d

Reported-by: Trygve Aaberge <trygveaa@gmail.com>
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-30 09:56:42 +09:00
bccc37fdc7 cygwin: disallow backslashes in file names
The backslash character is not a valid part of a file name on Windows.
If, in Windows, Git attempts to write a file that has a backslash
character in the filename, it will be incorrectly interpreted as a
directory separator.

This caused CVE-2019-1354 in MinGW, as this behaviour can be manipulated
to cause the checkout to write to files it ought not write to, such as
adding code to the .git/hooks directory.  This was fixed by e1d911dd4c
(mingw: disallow backslash characters in tree objects' file names,
2019-09-12).  However, the vulnerability also exists in Cygwin: while
Cygwin mostly provides a POSIX-like path system, it will still interpret
a backslash as a directory separator.

To avoid this vulnerability, CVE-2021-29468, extend the previous fix to
also apply to Cygwin.

Similarly, extend the test case added by the previous version of the
commit.  The test suite doesn't have an easy way to say "run this test
if in MinGW or Cygwin", so add a new test prerequisite that covers both.

As well as checking behaviour in the presence of paths containing
backslashes, the existing test also checks behaviour in the presence of
paths that differ only by the presence of a trailing ".".  MinGW follows
normal Windows application behaviour and treats them as the same path,
but Cygwin more closely emulates *nix systems (at the expense of
compatibility with native Windows applications) and will create and
distinguish between such paths.  Gate the relevant bit of that test
accordingly.

Reported-by: RyotaK <security@ryotak.me>
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Adam Dinwoodie <adam@dinwoodie.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-30 09:49:20 +09:00
c331551ccf git: support separate arg for --config-env's value
While not documented as such, many of the top-level options like
`--git-dir` and `--work-tree` support two syntaxes: they accept both an
equals sign between option and its value, and they do support option and
value as two separate arguments. The recently added `--config-env`
option only supports the syntax with an equals sign.

Mitigate this inconsistency by accepting both syntaxes and add tests to
verify both work.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-30 09:46:53 +09:00
9152904c11 git.txt: fix synopsis of --config-env missing the equals sign
When executing `git -h`, then the `--config-env` documentation rightly
lists the option as requiring an equals between the option and its
argument: this is the only currently supported format. But the git(1)
manpage incorrectly lists the option as taking a space in between.

Fix the issue by adding the missing space.

Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-of-by: Patrick Steinhardt <ps@pks.im>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-30 09:46:46 +09:00
526705fd3d apply: adjust messages to account for --3way changes
"git apply" specifically calls out when it is falling back to 3way
merge application.  Since the order changed to preferring 3way and
falling back to direct application, continue that behavior by
printing whenever 3way fails and git has to fall back.

Signed-off-by: Jerry Zhang <jerry@skydio.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-29 12:27:45 +09:00
2ba582ba4c prune: save reachable-from-recent objects with bitmaps
We pass our prune expiration to mark_reachable_objects(), which will
traverse not only the reachable objects, but consider any recent ones as
tips for reachability; see d3038d22f9 (prune: keep objects reachable
from recent objects, 2014-10-15) for details.

However, this interacts badly with the bitmap code path added in
fde67d6896 (prune: use bitmaps for reachability traversal, 2019-02-13).
If we hit the bitmap-optimized path, we return immediately to avoid the
regular traversal, accidentally skipping the "also traverse recent"
code.

Instead, we should do an if-else for the bitmap versus regular
traversal, and then follow up with the "recent" traversal in either
case. This reuses the "rev_info" for a bitmap and then a regular
traversal, but that should work OK (the bitmap code clears the pending
array in the usual way, just like a regular traversal would).

Note that I dropped the comment above the regular traversal here.  It
has little explanatory value, and makes the if-else logic much harder to
read.

Here are a few variants that I rejected:

  - it seems like both the reachability and recent traversals could be
    done in a single traversal. This was rejected by d3038d22f9 (prune:
    keep objects reachable from recent objects, 2014-10-15), though the
    balance may be different when using bitmaps. However, there's a
    subtle correctness issue, too: we use revs->ignore_missing_links for
    the recent traversal, but not the reachability one.

  - we could try using bitmaps for the recent traversal, too, which
    could possibly improve performance. But it would require some fixes
    in the bitmap code, which uses ignore_missing_links for its own
    purposes. Plus it would probably not help all that much in practice.
    We use the reachable tips to generate bitmaps, so those objects are
    likely not covered by bitmaps (unless they just became unreachable).
    And in general, we expect the set of unreachable objects to be much
    smaller anyway, so there's less to gain.

The test in t5304 detects the bug and confirms the fix.

I also beefed up the tests in t6501, which covers the mtime-checking
code more thoroughly, to handle the bitmap case (in addition to just
"loose" and "packed" cases). Interestingly, this test doesn't actually
detect the bug, because it is running "git gc", and not "prune"
directly. And "gc" will call "repack" first, which does not suffer the
same bug. So the old-but-reachable-from-recent objects get scooped up
into the new pack along with the actually-recent objects, which gives
both a recent mtime. But it seemed prudent to get more coverage of the
bitmap case for related code.

Reported-by: David Emett <dave@sp4m.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-29 10:38:25 +09:00
1e951c6473 pack-bitmap: clean up include_check after use
When a bitmap walk has to traverse (to fill in non-bitmapped objects),
we use rev_info's include_check mechanism to let us stop the traversal
early. But after setting the function and its data parameter, we never
clean it up. This means that if the rev_info is used for a subsequent
traversal without bitmaps, it will unexpectedly call into our
include_check function (worse, it will do so pointing to a now-defunct
stack variable in include_check_data, likely resulting in a segfault).

There's no code which does this now, but it's an accident waiting to
happen. Let's clean up after ourselves in the bitmap code.

Reported-by: David Emett <dave@sp4m.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-29 10:03:46 +09:00
9a3e3ca2ba subtree: be stricter about validating flags
Don't silently ignore a flag that's invalid for a given subcommand.  The
user expected it to do something; we should tell the user that they are
mistaken, instead of surprising the user.

It could be argued that this change might break existing users.  I'd
argue that those existing users are already broken, and they just don't
know it.  Let them know that they're broken.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:19 +09:00
49470cd445 subtree: push: allow specifying a local rev other than HEAD
'git subtree split' lets you specify a rev other than HEAD.  'git push'
lets you specify a mapping between a local thing and a remot ref.  So
smash those together, and have 'git subtree push' let you specify which
local thing to run split on and push the result of that split to the
remote ref.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:19 +09:00
94389e7c81 subtree: allow 'split' flags to be passed to 'push'
'push' does a 'split' internally, but it doesn't pass flags through to the
'split'.  This is silly, if you need to pass flags to 'split', then it
means that you can't use 'push'!

So, have 'push' accept 'split' flags, and pass them through to 'split'.

Add tests for this by copying split's tests with minimal modification.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:19 +09:00
cb6551447b subtree: allow --squash to be used with --rejoin
Besides being a genuinely useful thing to do, this also just makes sense
and harmonizes which flags may be used when.  `git subtree split
--rejoin` amounts to "automatically go ahead and do a `git subtree
merge` after doing the main `git subtree split`", so it's weird and
arbitrary that you can't pass `--squash` to `git subtree split --rejoin`
like you can `git subtree merge`.  It's weird that `git subtree split
--rejoin` inherits `git subtree merge`'s `--message` but not `--squash`.

Reconcile the situation by just having `split --rejoin` actually just
call `merge` internally (or call `add` instead, as appropriate), so it
can get access to the full `merge` behavior, including `--squash`.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:19 +09:00
6468784dd2 subtree: give the docs a once-over
Just went through the docs looking for anything inaccurate or that can
be improved.

In the '-h' text, in the man page synopsis, and in the man page
description: Normalize the ordering of the list of sub-commands: 'add',
'merge', 'split', 'pull', 'push'.  This allows us to kinda separate the
lower-level add/merge/split from the higher-level pull/push.

'-h' text:
 - correction: Indicate that split's arg is optional.
 - clarity: Emphasize that 'pull' takes the 'add'/'merge' flags.

man page:

 - correction: State that all subcommands take options (it seemed to
   indicate that only 'split' takes any options other than '-P').
 - correction: 'split' only guarantees that the results are identical if
   the flags are identical.
 - correction: The flag is named '--ignore-joins', not '--ignore-join'.
 - completeness: Clarify that 'push' always operates on HEAD, and that
   'split' operates on HEAD if no local commit is given.
 - clarity: In the description, when listing commands, repeat what their
   arguments are.  This way the reader doesn't need to flip back and
   forth between the command description and the synopsis and the full
   description to understand what's being said.
 - clarity: In the <variables> used to give command arguments, give
   slightly longer, descriptive names.  Like <local-commit> instead of
   just <commit>.
 - clarity: Emphasize that 'pull' takes the 'add'/'merge' flags.
 - style: In the synopsis, list options before the subcommand.  This
   makes things line up and be much more readable when shown
   non-monospace (such as in `make html`), and also more closely matches
   other man pages (like `git-submodule.txt`).
 - style: Use the correct syntax for indicating the options ([<options>]
   instead of [OPTIONS]).
 - style: In the synopsis, separate 'pull' and 'push' from the other
   lower-level commands.  I think this helps readability.
 - style: Code-quote things in prose that seem like they should be
   code-quoted, like '.gitmodules', flags, or full commands.
 - style: Minor wording improvements, like more consistent mood (many
   of the command descriptions start in the imperative mood and switch
   to the indicative mode by the end).  That sort of thing.
 - style: Capitalize "ID".
 - style: Remove the "This option is only valid for XXX command" remarks
   from each option, and instead rely on the section headings.
 - style: Since that line is getting edited anyway, switch "behaviour" to
   American "behavior".
 - style: Trim trailing whitespace.

`todo`:
 - style: Trim trailing whitespace.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:19 +09:00
e9525a8a02 subtree: have $indent actually affect indentation
Currently, the $indent variable is just used to track how deeply we're
nested, and the debug log is indented by things like

   debug "  foo"

That is: The indentation-level is hard-coded.  It used to be that the
code couldn't recurse, so the indentation level could be known
statically, so it made sense to just hard-code it in the
output. However, since 315a84f9aa ("subtree: use commits before rejoins
for splits", 2018-09-28), it can now recurse, and the debug log is
misleading.

So fix that.  Indent according to $indent.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:18 +09:00
534ff90dbd subtree: don't let debug and progress output clash
Currently, debug output (triggered by passing '-d') and progress output
stomp on each other.  The debug output is just streamed as lines to
stderr, and the progress output is sent to stderr as '%s\r'.  When
writing to a file, it is awkward to read and difficult to distinguish
between the debug output and a progress line.  When writing to a
terminal the debug lines hide progress lines.

So, when '-d' has been passed, spit out progress as 'progress: %s\n',
instead of as '%s\r', so that it can be detected, and so that the debug
lines don't overwrite the progress when written to a terminal.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:18 +09:00
5cdae0f6fd subtree: add comments and sanity checks
For each function in subtree, add a usage comment saying what the
arguments are, and add an `assert` checking the number of arguments.

In figuring out each thing's arguments in order to write those comments
and assertions, it turns out that find_existing_splits is written as if
it takes multiple 'revs', but it is in fact only ever passed a single
'rev':

	unrevs="$(find_existing_splits "$dir" "$rev")" || exit $?

So go ahead and codify that by documenting and asserting that it takes
exactly two arguments, one dir and one rev.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:18 +09:00
cbb5de8b83 subtree: remove duplicate check
`cmd_add` starts with a check that the directory doesn't yet exist.
However, the `main` function performs the exact same check before
calling `cmd_add`.  So remove the check from `cmd_add`.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:18 +09:00
e4f8baa88a subtree: parse revs in individual cmd_ functions
The main argument parser goes ahead and tries to parse revs to make
things simpler for the sub-command implementations.  But, it includes
enough special cases for different sub-commands.  And it's difficult
having having to think about "is this info coming from an argument, or a
global variable?".  So the main argument parser's effort to make things
"simpler" ends up just making it more confusing and complicated.

Begone with the 'revs' global variable; parse 'rev=$(...)' as needed in
individual 'cmd_*' functions.

Begone with the 'default' global variable.  Its would-be value is
knowable just from which function we're in.

Begone with the 'ensure_single_rev' function.  Its functionality can be
achieved by passing '--verify' to 'git rev-parse'.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:18 +09:00
bbffb02383 subtree: use "^{commit}" instead of "^0"
They are synonyms.  Both are used in the file.  ^{commit} is clearer, so
"standardize" on that.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:18 +09:00
22d5507493 subtree: don't fuss with PATH
Scripts needing to fuss with with adding $(git --exec-prefix) PATH
before loading git-sh-setup is a thing of the past.  As far as I can
tell, it's been a thing of the past since since Git v1.2.0 (2006-02-12),
or more specifically, since 77cb17e940 (Exec git programs without using
PATH, 2006-01-10).  However, it stuck around in contrib scripts and in
third-party scripts for long enough that it wasn't unusual to see.

Originally `git subtree` didn't fuss with PATH, but when people
(including the original subtree author) had problems, because it was a
common thing to see, it seemed that having subtree fuss with PATH was a
reasonable solution.

Here is an abridged history of fussing with PATH in subtree:

  2987e6add3 (Add explicit path of git installation by 'git --exec-path', Gianluca Pacchiella, 2009-08-20)

    As pointed out by documentation, the correct use of 'git-sh-setup' is
    using $(git --exec-path) to avoid problems with not standard
    installations.

    -. git-sh-setup
    +. $(git --exec-path)/git-sh-setup

  33aaa697a2 (Improve patch to use git --exec-path: add to PATH instead, Avery Pennarun, 2009-08-26)

    If you (like me) are using a modified git straight out of its source
    directory (ie. without installing), then --exec-path isn't actually correct.
    Add it to the PATH instead, so if it is correct, it'll work, but if it's
    not, we fall back to the previous behaviour.

    -. $(git --exec-path)/git-sh-setup
    +PATH=$(git --exec-path):$PATH
    +. git-sh-setup

  9c632ea29c ((Hopefully) fix PATH setting for msysgit, Avery Pennarun, 2010-06-24)

    Reported by Evan Shaw.  The problem is that $(git --exec-path) includes a
    'git' binary which is incompatible with the one in /usr/bin; if you run it,
    it gives you an error about libiconv2.dll.

    +OPATH=$PATH
     PATH=$(git --exec-path):$PATH
     . git-sh-setup
    +PATH=$OPATH  # apparently needed for some versions of msysgit

  df2302d774 (Another fix for PATH and msysgit, Avery Pennarun, 2010-06-24)

    Evan Shaw tells me the previous fix didn't work.  Let's use this one
    instead, which he says does work.

    This fix is kind of wrong because it will run the "correct" git-sh-setup
    *after* the one in /usr/bin, if there is one, which could be weird if you
    have multiple versions of git installed.  But it works on my Linux and his
    msysgit, so it's obviously better than what we had before.

    -OPATH=$PATH
    -PATH=$(git --exec-path):$PATH
    +PATH=$PATH:$(git --exec-path)
     . git-sh-setup
    -PATH=$OPATH  # apparently needed for some versions of msysgit

First of all, I disagree with Gianluca's reading of the documentation:
 - I haven't gone back to read what the documentation said in 2009, but
   in my reading of the 2021 documentation is that it includes "$(git
   --exec-path)/" in the synopsis for illustrative purposes, not to say
   it's the proper way.
 - After being executed by `git`, the git exec path should be the very
   first entry in PATH, so it shouldn't matter.
 - None of the scripts that are part of git do it that way.

But secondly, the root reason for fussing with PATH seems to be that
Avery didn't know that he needs to set GIT_EXEC_PATH if he's going to
use git from the source directory without installing.

And finally, Evan's issue is clearly just a bug in msysgit.  I assume
that msysgit has since fixed the issue, and also msysgit has been
deprecated for 6 years now, so let's drop the workaround for it.

So, remove the line fussing with PATH.  However, since subtree *is* in
'contrib/' and it might get installed in funny ways by users
after-the-fact, add a sanity check to the top of the script, checking
that it is installed correctly.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:18 +09:00
a94f911072 subtree: use "$*" instead of "$@" as appropriate
"$*" is for when you want to concatenate the args together,
whitespace-separated; and "$@" is for when you want them to be separate
strings.

There are several places in subtree that erroneously use $@ when
concatenating args together into an error message.

For instance, if the args are argv[1]="dead" and argv[2]="beef", then
the line

    die "You must provide exactly one revision.  Got: '$@'"

surely intends to call 'die' with the argument

    argv[1]="You must provide exactly one revision.  Got: 'dead beef'"

however, because the line used $@ instead of $*, it will actually call
'die' with the arguments

    argv[1]="You must provide exactly one revision.  Got: 'dead"
    argv[2]="beef'"

This isn't a big deal, because 'die' concatenates its arguments together
anyway (using "$*").  But that doesn't change the fact that it was a
mistake to use $@ instead of $*, even though in the end $@ still ended
up doing the right thing.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:18 +09:00
e2b11e4211 subtree: use more explicit variable names for cmdline args
Make it painfully obvious when reading the code which variables are
direct parsings of command line arguments.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:18 +09:00
6d43585a68 subtree: use git-sh-setup's say
subtree currently defines its own `say` implementation, rather than
using git-sh-setups's implementation.  Change that, don't re-invent the
wheel.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:18 +09:00
f664304836 subtree: use git merge-base --is-ancestor
Instead of writing a slow `rev_is_descendant_of_branch $a $b` function
in shell, just use the fast `git merge-base --is-ancestor $b $a`.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:17 +09:00
8dc3240f5f subtree: drop support for git < 1.7
Suport for Git versions older than 1.7.0 (older than February 2010) was
nice to have when git-subtree lived out-of-tree.  But now that it lives
in git.git, it's not necessary to keep around.  While it's technically
in contrib, with the standard 'git' packages for common systems
(including Arch Linux and macOS) including git-subtree, it seems
vanishingly likely to me that people are separately installing
git-subtree from git.git alongside an older 'git' install (although it
also seems vanishingly likely that people are still using >11 year old
git installs).

Not that there's much reason to remove it either, it's not much code,
and none of my changes depend on a newer git (to my knowledge, anyway;
I'm not actually testing against older git).  I just figure it's an easy
piece of fat to trim, in the journey to making the whole thing easier to
hack on.

"Ignore space change" is probably helpful when viewing this diff.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:17 +09:00
d2f0f81954 subtree: more consistent error propagation
Ensure that every $(subshell) that calls a function (as opposed to an
external executable) is followed by `|| exit $?`.  Similarly, ensure that
every `cmd | while read; do ... done` loop is followed by `|| exit $?`.

Both of those constructs mean that it can miss `die` calls, and keep
running when it shouldn't.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:17 +09:00
5a3569774f subtree: don't have loose code outside of a function
Shove all of the loose code inside of a main() function.

This comes down to personal preference more than anything else.  A
preference that I've developed over years of maintaining large Bash
scripts, but still a mere personal preference.

In this specific case, it's also moving the `set -- -h`, the `git
rev-parse --parseopt`, and the `. git-sh-setup` to be closer to all
the rest of the argument parsing, which is a readability win on its
own, IMO.

"Ignore space change" is probably helpful when viewing this diff.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:17 +09:00
b04538d99f subtree: t7900: add porcelain tests for 'pull' and 'push'
The 'pull' and 'push' subcommands deserve their own sections in the tests.
Add some basic tests for them.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:17 +09:00
b269976979 subtree: t7900: add a test for the -h flag
It's a dumb test, but it's surprisingly easy to break.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:17 +09:00
db6952b2b2 subtree: t7900: rename last_commit_message to last_commit_subject
t7900-subtree.sh defines a helper function named last_commit_message.
However, it only returns the subject line of the commit message, not the
entire commit message.  So rename it, to make the name less confusing.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:17 +09:00
f1cd2d93c2 subtree: t7900: fix 'verify one file change per commit'
As far as I can tell, this test isn't actually testing anything, because
someone forgot to tack on `--name-only` to `git log`.  This seems to
have been the case since the test was first written, back in fa16ab36ad
("test.sh: make sure no commit changes more than one file at a time.",
2009-04-26), unless `git log` used to do that by default and didn't need
the flag back then?

Convincing myself that it's not actually testing anything was tricky,
the code is a little hard to reason about.  It can be made a lot simpler
if instead of trying to parse all of the info from a single `git log`,
we're OK calling `git log` from inside of a loop.  And it's my opinion
that tests are not the place for clever optimized code.

So, fix and simplify the test, so that it's actually testing something
and is simpler to reason about.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:17 +09:00
63ac4f1ade subtree: t7900: delete some dead code
Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:17 +09:00
c4566ab429 subtree: t7900: use 'test' for string equality
t7900-subtree.sh defines its own `check_equal A B` function, instead of
just using `test A = B` like all of the other tests.  Don't be special,
get rid of `check_equal` in favor of `test`.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:16 +09:00
40b1e1ec58 subtree: t7900: comment subtree_test_create_repo
It's unclear what the purpose of t7900-subtree.sh's
`subtree_test_create_repo` helper function is.  It wraps test-lib.sh's,
`test_create_repo` but follows that up by setting log.date=relative.  Why
does it set log.date=relative?

My first guess was that at one point the tests required that, but no
longer do, and that the function is now vestigial.  I even wrote a patch
to get rid of it and was moments away from `git send-email`ing it.

However, by chance when looking for something else in the history, I
discovered the true reason, from e7aac44ed2 (contrib/subtree: ignore
log.date configuration, 2015-07-21).  It's testing that setting
log.date=relative doesn't break `git subtree`, as at one point in the past
that did break `git subtree`.

So, add a comment about this, to avoid future such confusion.

And while at it, go ahead and (1) touch up the function to avoid a
pointless subshell and (2) update the one test that didn't use it.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:16 +09:00
f700406957 subtree: t7900: use consistent formatting
The formatting in t7900-subtree.sh isn't even consistent throughout the
file.  Fix that; make it consistent throughout the file.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:16 +09:00
f2bb7fef7a subtree: t7900: use test-lib.sh's test_count
Use test-lib.sh's `test_count`, instead instead of having
t7900-subtree.sh do its own book-keeping with `subtree_test_count` that
has to be explicitly incremented by calling `next_test`.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:16 +09:00
914d512551 subtree: t7900: update for having the default branch name be 'main'
Most of the tests had been converted to support
`GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main`, but `contrib/subtree/t/`
hadn't.

Convert it.  Most of the mentions of 'master' can just be replaced with
'HEAD'.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:16 +09:00
4c996deb4a .gitignore: ignore 'git-subtree' as a build artifact
Running `make -C contrib/subtree/ test` creates a `git-subtree` executable
in the root of the repo.  Add it to the .gitignore so that anyone hacking
on subtree won't have to deal with that noise.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:46:30 +09:00
a643157d5a repack: avoid loosening promisor objects in partial clones
When `git repack -A -d` is run in a partial clone, `pack-objects`
is invoked twice: once to repack all promisor objects, and once to
repack all non-promisor objects. The latter `pack-objects` invocation
is with --exclude-promisor-objects and --unpack-unreachable, which
loosens all objects unused during this invocation. Unfortunately,
this includes promisor objects.

Because the -d argument to `git repack` subsequently deletes all loose
objects also in packs, these just-loosened promisor objects will be
immediately deleted. However, this extra disk churn is unnecessary in
the first place.  For example, in a newly-cloned partial repo that
filters all blob objects (e.g. `--filter=blob:none`), `repack` ends up
unpacking all trees and commits into the filesystem because every
object, in this particular case, is a promisor object. Depending on
the repo size, this increases the disk usage considerably: In my copy
of the linux.git, the object directory peaked 26GB of more disk usage.

In order to avoid this extra disk churn, pass the names of the promisor
packfiles as --keep-pack arguments to the second invocation of
`pack-objects`. This informs `pack-objects` that the promisor objects
are already in a safe packfile and, therefore, do not need to be
loosened.

For testing, we need to validate whether any object was loosened.
However, the "evidence" (loosened objects) is deleted during the
process which prevents us from inspecting the object directory.
Instead, let's teach `pack-objects` to count loosened objects and
emit via trace2 thus allowing inspecting the debug events after the
process is finished. This new event is used on the added regression
test.

Lastly, add a new perf test to evaluate the performance impact
made by this changes (tested on git.git):

     Test          HEAD^                 HEAD
     ----------------------------------------------------------
     5600.3: gc    134.38(41.93+90.95)   7.80(6.72+1.35) -94.2%

For a bigger repository, such as linux.git, the improvement is
even bigger:

     Test          HEAD^                     HEAD
     -------------------------------------------------------------------
     5600.3: gc    6833.00(918.07+3162.74)   268.79(227.02+39.18) -96.1%

These improvements are particular big because every object in the
newly-cloned partial repository is a promisor object.

Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Helped-by: Jeff King <peff@peff.net>
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Rafael Silva <rafaeloliveira.cs@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 13:36:13 +09:00
7a14acdbe6 doc: point to diff attribute in patch format docs
From the documentation for generating patch text with diff-related
commands, refer to the documentation for the diff attribute.

This attribute influences the way that patches are generated, but this
was previously not mentioned in e.g., the git-diff manpage.

Signed-off-by: Peter Oliver <git@mavit.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 13:34:44 +09:00
37be11994f builtin/rm: avoid leaking pathspec and seen
parse_pathspec() populates pathspec, hence we need to clear it once it's
no longer needed. seen is xcalloc'd within the same function and
likewise needs to be freed once its no longer needed.

cmd_rm() has multiple early returns, therefore we need to clear or free
as soon as this data is no longer needed, as opposed to doing a cleanup
at the end.

LSAN output from t0020:

Direct leak of 112 byte(s) in 1 object(s) allocated from:
    #0 0x49a85d in malloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
    #1 0x9ac0a4 in do_xmalloc wrapper.c:41:8
    #2 0x9ac07a in xmalloc wrapper.c:62:9
    #3 0x873277 in parse_pathspec pathspec.c:582:2
    #4 0x646ffa in cmd_rm builtin/rm.c:266:2
    #5 0x4cd91d in run_builtin git.c:467:11
    #6 0x4cb5f3 in handle_builtin git.c:719:3
    #7 0x4ccf47 in run_argv git.c:808:4
    #8 0x4caf49 in cmd_main git.c:939:19
    #9 0x69dc0e in main common-main.c:52:11
    #10 0x7f948825b349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Indirect leak of 65 byte(s) in 1 object(s) allocated from:
    #0 0x49ab79 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
    #1 0x9ac2a6 in xrealloc wrapper.c:126:8
    #2 0x93b14d in strbuf_grow strbuf.c:98:2
    #3 0x93ccf6 in strbuf_vaddf strbuf.c:392:3
    #4 0x93f726 in xstrvfmt strbuf.c:979:2
    #5 0x93f8b3 in xstrfmt strbuf.c:989:8
    #6 0x92ad8a in prefix_path_gently setup.c:115:15
    #7 0x873a8d in init_pathspec_item pathspec.c:439:11
    #8 0x87334f in parse_pathspec pathspec.c:589:3
    #9 0x646ffa in cmd_rm builtin/rm.c:266:2
    #10 0x4cd91d in run_builtin git.c:467:11
    #11 0x4cb5f3 in handle_builtin git.c:719:3
    #12 0x4ccf47 in run_argv git.c:808:4
    #13 0x4caf49 in cmd_main git.c:939:19
    #14 0x69dc0e in main common-main.c:52:11
    #15 0x7f948825b349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Indirect leak of 15 byte(s) in 1 object(s) allocated from:
    #0 0x486834 in strdup ../projects/compiler-rt/lib/asan/asan_interceptors.cpp:452:3
    #1 0x9ac048 in xstrdup wrapper.c:29:14
    #2 0x873ba2 in init_pathspec_item pathspec.c:468:20
    #3 0x87334f in parse_pathspec pathspec.c:589:3
    #4 0x646ffa in cmd_rm builtin/rm.c:266:2
    #5 0x4cd91d in run_builtin git.c:467:11
    #6 0x4cb5f3 in handle_builtin git.c:719:3
    #7 0x4ccf47 in run_argv git.c:808:4
    #8 0x4caf49 in cmd_main git.c:939:19
    #9 0x69dc0e in main common-main.c:52:11
    #10 0x7f948825b349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Direct leak of 1 byte(s) in 1 object(s) allocated from:
    #0 0x49a9d2 in calloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:154:3
    #1 0x9ac392 in xcalloc wrapper.c:140:8
    #2 0x647108 in cmd_rm builtin/rm.c:294:9
    #3 0x4cd91d in run_builtin git.c:467:11
    #4 0x4cb5f3 in handle_builtin git.c:719:3
    #5 0x4ccf47 in run_argv git.c:808:4
    #6 0x4caf49 in cmd_main git.c:939:19
    #7 0x69dbfe in main common-main.c:52:11
    #8 0x7f4fac1b0349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 09:25:45 +09:00
805b789a69 builtin/rebase: release git_format_patch_opt too
options.git_format_patch_opt can be populated during cmd_rebase's setup,
and will therefore leak on return. Although we could just UNLEAK all of
options, we choose to strbuf_release() the individual member, which matches
the existing pattern (where we're freeing invidual members of options).

Leak found when running t0021:

Direct leak of 24 byte(s) in 1 object(s) allocated from:
    #0 0x49ab79 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
    #1 0x9ac296 in xrealloc wrapper.c:126:8
    #2 0x93b13d in strbuf_grow strbuf.c:98:2
    #3 0x93bd3a in strbuf_add strbuf.c:295:2
    #4 0x60ae92 in strbuf_addstr strbuf.h:304:2
    #5 0x605f17 in cmd_rebase builtin/rebase.c:1759:3
    #6 0x4cd91d in run_builtin git.c:467:11
    #7 0x4cb5f3 in handle_builtin git.c:719:3
    #8 0x4ccf47 in run_argv git.c:808:4
    #9 0x4caf49 in cmd_main git.c:939:19
    #10 0x69dbfe in main common-main.c:52:11
    #11 0x7f66dae91349 in __libc_start_main (/lib64/libc.so.6+0x24349)

SUMMARY: AddressSanitizer: 24 byte(s) leaked in 1 allocation(s).

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 09:25:45 +09:00
a317a553b8 builtin/for-each-ref: free filter and UNLEAK sorting.
sorting might be a list allocated in ref_default_sorting() (in this case
it's a fixed single item list, which has nevertheless been xcalloc'd),
or it might be a list allocated in parse_opt_ref_sorting(). In either
case we could free these lists - but instead we UNLEAK as we're at the
end of cmd_for_each_ref. (There's no existing implementation of
clear_ref_sorting(), and writing a loop to free the list seems more
trouble than it's worth.)

filter.with_commit/no_commit are populated via
OPT_CONTAINS/OPT_NO_CONTAINS, both of which create new entries via
parse_opt_commits(), and also need to be free'd or UNLEAK'd. Because
free_commit_list() already exists, we choose to use that over an UNLEAK.

LSAN output from t0041:

Direct leak of 16 byte(s) in 1 object(s) allocated from:
    #0 0x49a9d2 in calloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:154:3
    #1 0x9ac252 in xcalloc wrapper.c:140:8
    #2 0x8a4a55 in ref_default_sorting ref-filter.c:2486:32
    #3 0x56c6b1 in cmd_for_each_ref builtin/for-each-ref.c:72:13
    #4 0x4cd91d in run_builtin git.c:467:11
    #5 0x4cb5f3 in handle_builtin git.c:719:3
    #6 0x4ccf47 in run_argv git.c:808:4
    #7 0x4caf49 in cmd_main git.c:939:19
    #8 0x69dabe in main common-main.c:52:11
    #9 0x7f2bdc570349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Direct leak of 16 byte(s) in 1 object(s) allocated from:
    #0 0x49a85d in malloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
    #1 0x9abf54 in do_xmalloc wrapper.c:41:8
    #2 0x9abf2a in xmalloc wrapper.c:62:9
    #3 0x717486 in commit_list_insert commit.c:540:33
    #4 0x8644cf in parse_opt_commits parse-options-cb.c:98:2
    #5 0x869bb5 in get_value parse-options.c:181:11
    #6 0x8677dc in parse_long_opt parse-options.c:378:10
    #7 0x8659bd in parse_options_step parse-options.c:817:11
    #8 0x867fcd in parse_options parse-options.c:870:10
    #9 0x56c62b in cmd_for_each_ref builtin/for-each-ref.c:59:2
    #10 0x4cd91d in run_builtin git.c:467:11
    #11 0x4cb5f3 in handle_builtin git.c:719:3
    #12 0x4ccf47 in run_argv git.c:808:4
    #13 0x4caf49 in cmd_main git.c:939:19
    #14 0x69dabe in main common-main.c:52:11
    #15 0x7f2bdc570349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 09:25:45 +09:00
f3a9680791 mailinfo: also free strbuf lists when clearing mailinfo
mailinfo.p_hdr_info/s_hdr_info are null-terminated lists of strbuf's,
with entries pointing either to NULL or an allocated strbuf. Therefore
we need to free those strbuf's (and not just the data they contain)
whenever we're done with a given entry. (See handle_header() where those
new strbufs are malloc'd.)

Once we no longer need the list (and not just its entries) we can switch
over to strbuf_list_free() instead of manually iterating over the list,
which takes care of those additional details for us. We can only do this
in clear_mailinfo() - in handle_commit_message() we are only clearing the
array contents but want to reuse the array itself, hence we can't use
strbuf_list_free() there.

However, strbuf_list_free() cannot handle a NULL input, and the lists we
are freeing might be NULL. Therefore we add a NULL check in
strbuf_list_free() to make it safe to use with a NULL input (which is a
pattern used by some of the other *_free() functions around git).

Leak output from t0023:

Direct leak of 72 byte(s) in 3 object(s) allocated from:
    #0 0x49a85d in malloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
    #1 0x9ac9f4 in do_xmalloc wrapper.c:41:8
    #2 0x9ac9ca in xmalloc wrapper.c:62:9
    #3 0x7f6cf7 in handle_header mailinfo.c:205:10
    #4 0x7f5abf in check_header mailinfo.c:583:4
    #5 0x7f5524 in mailinfo mailinfo.c:1197:3
    #6 0x4dcc95 in parse_mail builtin/am.c:1167:6
    #7 0x4d9070 in am_run builtin/am.c:1732:12
    #8 0x4d5b7a in cmd_am builtin/am.c:2398:3
    #9 0x4cd91d in run_builtin git.c:467:11
    #10 0x4cb5f3 in handle_builtin git.c:719:3
    #11 0x4ccf47 in run_argv git.c:808:4
    #12 0x4caf49 in cmd_main git.c:939:19
    #13 0x69e43e in main common-main.c:52:11
    #14 0x7fc1fadfa349 in __libc_start_main (/lib64/libc.so.6+0x24349)

SUMMARY: AddressSanitizer: 72 byte(s) leaked in 3 allocation(s).

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 09:25:45 +09:00
52a9436aa7 builtin/checkout: clear pending objects after diffing
add_pending_object() populates rev.pending, we need to take care of
clearing it once we're done.

This code is run close to the end of a checkout, therefore this leak
seems like it would have very little impact. See also LSAN output
from t0020 below:

Direct leak of 2048 byte(s) in 1 object(s) allocated from:
    #0 0x49ab79 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
    #1 0x9acc46 in xrealloc wrapper.c:126:8
    #2 0x83e3a3 in add_object_array_with_path object.c:337:3
    #3 0x8f672a in add_pending_object_with_path revision.c:329:2
    #4 0x8eaeab in add_pending_object_with_mode revision.c:336:2
    #5 0x8eae9d in add_pending_object revision.c:342:2
    #6 0x5154a0 in show_local_changes builtin/checkout.c:602:2
    #7 0x513b00 in merge_working_tree builtin/checkout.c:979:3
    #8 0x512cb3 in switch_branches builtin/checkout.c:1242:9
    #9 0x50f8de in checkout_branch builtin/checkout.c:1646:9
    #10 0x50ba12 in checkout_main builtin/checkout.c:2003:9
    #11 0x5086c0 in cmd_checkout builtin/checkout.c:2055:8
    #12 0x4cd91d in run_builtin git.c:467:11
    #13 0x4cb5f3 in handle_builtin git.c:719:3
    #14 0x4ccf47 in run_argv git.c:808:4
    #15 0x4caf49 in cmd_main git.c:939:19
    #16 0x69e43e in main common-main.c:52:11
    #17 0x7f5dd1d50349 in __libc_start_main (/lib64/libc.so.6+0x24349)

SUMMARY: AddressSanitizer: 2048 byte(s) leaked in 1 allocation(s).
Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 09:25:45 +09:00
265644367f builtin/check-ignore: clear_pathspec before returning
parse_pathspec() allocates new memory into pathspec, therefore we need
to free it when we're done.

An UNLEAK would probably be just as good here - but clear_pathspec() is
not much more work so we might as well use it. check_ignore() is either
called once directly from cmd_check_ignore() (in which case the leak
really doesnt matter), or it can be called multiple times in a loop from
check_ignore_stdin_paths(), in which case we're potentially leaking
multiple times - but even in this scenario the leak is so small as to
have no real consequence.

Found while running t0008:

Direct leak of 112 byte(s) in 1 object(s) allocated from:
    #0 0x49a85d in malloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
    #1 0x9aca44 in do_xmalloc wrapper.c:41:8
    #2 0x9aca1a in xmalloc wrapper.c:62:9
    #3 0x873c17 in parse_pathspec pathspec.c:582:2
    #4 0x503eb8 in check_ignore builtin/check-ignore.c:90:2
    #5 0x5038af in cmd_check_ignore builtin/check-ignore.c:190:17
    #6 0x4cd91d in run_builtin git.c:467:11
    #7 0x4cb5f3 in handle_builtin git.c:719:3
    #8 0x4ccf47 in run_argv git.c:808:4
    #9 0x4caf49 in cmd_main git.c:939:19
    #10 0x69e43e in main common-main.c:52:11
    #11 0x7f18bb0dd349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Indirect leak of 65 byte(s) in 1 object(s) allocated from:
    #0 0x49ab79 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
    #1 0x9acc46 in xrealloc wrapper.c:126:8
    #2 0x93baed in strbuf_grow strbuf.c:98:2
    #3 0x93d696 in strbuf_vaddf strbuf.c:392:3
    #4 0x9400c6 in xstrvfmt strbuf.c:979:2
    #5 0x940253 in xstrfmt strbuf.c:989:8
    #6 0x92b72a in prefix_path_gently setup.c:115:15
    #7 0x87442d in init_pathspec_item pathspec.c:439:11
    #8 0x873cef in parse_pathspec pathspec.c:589:3
    #9 0x503eb8 in check_ignore builtin/check-ignore.c:90:2
    #10 0x5038af in cmd_check_ignore builtin/check-ignore.c:190:17
    #11 0x4cd91d in run_builtin git.c:467:11
    #12 0x4cb5f3 in handle_builtin git.c:719:3
    #13 0x4ccf47 in run_argv git.c:808:4
    #14 0x4caf49 in cmd_main git.c:939:19
    #15 0x69e43e in main common-main.c:52:11
    #16 0x7f18bb0dd349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Indirect leak of 2 byte(s) in 1 object(s) allocated from:
    #0 0x486834 in strdup ../projects/compiler-rt/lib/asan/asan_interceptors.cpp:452:3
    #1 0x9ac9e8 in xstrdup wrapper.c:29:14
    #2 0x874542 in init_pathspec_item pathspec.c:468:20
    #3 0x873cef in parse_pathspec pathspec.c:589:3
    #4 0x503eb8 in check_ignore builtin/check-ignore.c:90:2
    #5 0x5038af in cmd_check_ignore builtin/check-ignore.c:190:17
    #6 0x4cd91d in run_builtin git.c:467:11
    #7 0x4cb5f3 in handle_builtin git.c:719:3
    #8 0x4ccf47 in run_argv git.c:808:4
    #9 0x4caf49 in cmd_main git.c:939:19
    #10 0x69e43e in main common-main.c:52:11
    #11 0x7f18bb0dd349 in __libc_start_main (/lib64/libc.so.6+0x24349)

SUMMARY: AddressSanitizer: 179 byte(s) leaked in 3 allocation(s).

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 09:25:45 +09:00
4fa268738c builtin/bugreport: don't leak prefixed filename
prefix_filename() returns newly allocated memory, and strbuf_addstr()
doesn't take ownership of its inputs. Therefore we have to make sure to
store and free prefix_filename()'s result.

As this leak is in cmd_bugreport(), we could just as well UNLEAK the
prefix - but there's no good reason not to just free it properly. This
leak was found while running t0091, see output below:

Direct leak of 24 byte(s) in 1 object(s) allocated from:
    #0 0x49ab79 in realloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
    #1 0x9acc66 in xrealloc wrapper.c:126:8
    #2 0x93baed in strbuf_grow strbuf.c:98:2
    #3 0x93c6ea in strbuf_add strbuf.c:295:2
    #4 0x69f162 in strbuf_addstr ./strbuf.h:304:2
    #5 0x69f083 in prefix_filename abspath.c:277:2
    #6 0x4fb275 in cmd_bugreport builtin/bugreport.c:146:9
    #7 0x4cd91d in run_builtin git.c:467:11
    #8 0x4cb5f3 in handle_builtin git.c:719:3
    #9 0x4ccf47 in run_argv git.c:808:4
    #10 0x4caf49 in cmd_main git.c:939:19
    #11 0x69df9e in main common-main.c:52:11
    #12 0x7f523a987349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 09:25:45 +09:00
d895804b5a branch: FREE_AND_NULL instead of NULL'ing real_ref
real_ref was previously populated by dwim_ref(), which allocates new
memory. We need to make sure to free real_ref when discarding it.
(real_ref is already being freed at the end of create_branch() - but
if we discard it early then it will leak.)

This fixes the following leak found while running t0002-t0099:

Direct leak of 5 byte(s) in 1 object(s) allocated from:
    #0 0x486954 in strdup /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_interceptors.cpp:452:3
    #1 0xdd6484 in xstrdup wrapper.c:29:14
    #2 0xc0f658 in expand_ref refs.c:671:12
    #3 0xc0ecf1 in repo_dwim_ref refs.c:644:22
    #4 0x8b1184 in dwim_ref ./refs.h:162:9
    #5 0x8b0b02 in create_branch branch.c:284:10
    #6 0x550cbb in update_refs_for_switch builtin/checkout.c:1046:4
    #7 0x54e275 in switch_branches builtin/checkout.c:1274:2
    #8 0x548828 in checkout_branch builtin/checkout.c:1668:9
    #9 0x541306 in checkout_main builtin/checkout.c:2025:9
    #10 0x5395fa in cmd_checkout builtin/checkout.c:2077:8
    #11 0x4d02a8 in run_builtin git.c:467:11
    #12 0x4cbfe9 in handle_builtin git.c:719:3
    #13 0x4cf04f in run_argv git.c:808:4
    #14 0x4cb85a in cmd_main git.c:939:19
    #15 0x820cf6 in main common-main.c:52:11
    #16 0x7f30bd9dd349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 09:25:45 +09:00
b180c681bb bloom: clear each bloom_key after use
fill_bloom_key() allocates memory into bloom_key, we need to clean that
up once the key is no longer needed.

This leak was found while running t0002-t0099. Although this leak is
happening in code being called from a test-helper, the same code is also
used in various locations around git, and can therefore happen during
normal usage too. Gabor's analysis shows that peak-memory usage during
'git commit-graph write' is reduced on the order of 10% for a selection
of larger repos (along with an even larger reduction if we override
modified path bloom filter limits):
https://lore.kernel.org/git/20210411072651.GF2947267@szeder.dev/

LSAN output:

Direct leak of 308 byte(s) in 11 object(s) allocated from:
    #0 0x49a5e2 in calloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:154:3
    #1 0x6f4032 in xcalloc wrapper.c:140:8
    #2 0x4f2905 in fill_bloom_key bloom.c:137:28
    #3 0x4f34c1 in get_or_compute_bloom_filter bloom.c:284:4
    #4 0x4cb484 in get_bloom_filter_for_commit t/helper/test-bloom.c:43:11
    #5 0x4cb072 in cmd__bloom t/helper/test-bloom.c:97:3
    #6 0x4ca7ef in cmd_main t/helper/test-tool.c:121:11
    #7 0x4caace in main common-main.c:52:11
    #8 0x7f798af95349 in __libc_start_main (/lib64/libc.so.6+0x24349)

SUMMARY: AddressSanitizer: 308 byte(s) leaked in 11 allocation(s).

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 09:25:44 +09:00
4c217a4c34 ls-files: free max_prefix when done
common_prefix() returns a new string, which we store in max_prefix -
this string needs to be freed to avoid a leak. This leak is happening
in cmd_ls_files, hence is of no real consequence - an UNLEAK would be
just as good, but we might as well free the string properly.

Leak found while running t0002, see output below:

Direct leak of 8 byte(s) in 1 object(s) allocated from:
    #0 0x49a85d in malloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
    #1 0x9ab1b4 in do_xmalloc wrapper.c:41:8
    #2 0x9ab248 in do_xmallocz wrapper.c:75:8
    #3 0x9ab22a in xmallocz wrapper.c:83:9
    #4 0x9ab2d7 in xmemdupz wrapper.c:99:16
    #5 0x78d6a4 in common_prefix dir.c:191:15
    #6 0x5aca48 in cmd_ls_files builtin/ls-files.c:669:16
    #7 0x4cd92d in run_builtin git.c:453:11
    #8 0x4cb5fa in handle_builtin git.c:704:3
    #9 0x4ccf57 in run_argv git.c:771:4
    #10 0x4caf49 in cmd_main git.c:902:19
    #11 0x69ce2e in main common-main.c:52:11
    #12 0x7f64d4d94349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 09:25:44 +09:00
5493ce7af9 wt-status: fix multiple small leaks
rev.prune_data is populated (in multiple functions) via copy_pathspec,
and therefore needs to be cleared after running the diff in those
functions.

rev(_info).pending is populated indirectly via setup_revisions, and also
needs to be cleared once diffing is done.

These leaks were found while running t0008 or t0021. The rev.prune_data
leaks are small (80B) but noisy, hence I won't bother including their
logs - the rev.pending leaks are bigger, and can happen early in the
course of other commands, and therefore possibly more valuable to fix -
see example log from a rebase below:

Direct leak of 2048 byte(s) in 1 object(s) allocated from:
    #0 0x49ab79 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
    #1 0x9ac2a6 in xrealloc wrapper.c:126:8
    #2 0x83da03 in add_object_array_with_path object.c:337:3
    #3 0x8f5d8a in add_pending_object_with_path revision.c:329:2
    #4 0x8ea50b in add_pending_object_with_mode revision.c:336:2
    #5 0x8ea4fd in add_pending_object revision.c:342:2
    #6 0x8ea610 in add_head_to_pending revision.c:354:2
    #7 0x9b55f5 in has_uncommitted_changes wt-status.c:2474:2
    #8 0x9b58c4 in require_clean_work_tree wt-status.c:2553:6
    #9 0x606bcc in cmd_rebase builtin/rebase.c:1970:6
    #10 0x4cd91d in run_builtin git.c:467:11
    #11 0x4cb5f3 in handle_builtin git.c:719:3
    #12 0x4ccf47 in run_argv git.c:808:4
    #13 0x4caf49 in cmd_main git.c:939:19
    #14 0x69dc0e in main common-main.c:52:11
    #15 0x7f2d18909349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Indirect leak of 5 byte(s) in 1 object(s) allocated from:
    #0 0x486834 in strdup ../projects/compiler-rt/lib/asan/asan_interceptors.cpp:452:3
    #1 0x9ac048 in xstrdup wrapper.c:29:14
    #2 0x83da8d in add_object_array_with_path object.c:349:17
    #3 0x8f5d8a in add_pending_object_with_path revision.c:329:2
    #4 0x8ea50b in add_pending_object_with_mode revision.c:336:2
    #5 0x8ea4fd in add_pending_object revision.c:342:2
    #6 0x8ea610 in add_head_to_pending revision.c:354:2
    #7 0x9b55f5 in has_uncommitted_changes wt-status.c:2474:2
    #8 0x9b58c4 in require_clean_work_tree wt-status.c:2553:6
    #9 0x606bcc in cmd_rebase builtin/rebase.c:1970:6
    #10 0x4cd91d in run_builtin git.c:467:11
    #11 0x4cb5f3 in handle_builtin git.c:719:3
    #12 0x4ccf47 in run_argv git.c:808:4
    #13 0x4caf49 in cmd_main git.c:939:19
    #14 0x69dc0e in main common-main.c:52:11
    #15 0x7f2d18909349 in __libc_start_main (/lib64/libc.so.6+0x24349)

SUMMARY: AddressSanitizer: 2053 byte(s) leaked in 2 allocation(s).

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 09:25:44 +09:00
db69bf608d revision: free remainder of old commit list in limit_list
limit_list() iterates over the original revs->commits list, and consumes
many of its entries via pop_commit. However we might stop iterating over
the list early (e.g. if we realise that the rest of the list is
uninteresting). If we do stop iterating early, list will be pointing to
the unconsumed portion of revs->commits - and we need to free this list
to avoid a leak. (revs->commits itself will be an invalid pointer: it
will have been free'd during the first pop_commit.)

However the list pointer is later reused to iterate over our new list,
but only for the limiting_can_increase_treesame() branch. We therefore
need to introduce a new variable for that branch - and while we're here
we can rename the original list to original_list as that makes its
purpose more obvious.

This leak was found while running t0090. It's not likely to be very
impactful, but it can happen quite early during some checkout
invocations, and hence seems to be worth fixing:

Direct leak of 16 byte(s) in 1 object(s) allocated from:
    #0 0x49a85d in malloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
    #1 0x9ac084 in do_xmalloc wrapper.c:41:8
    #2 0x9ac05a in xmalloc wrapper.c:62:9
    #3 0x7175d6 in commit_list_insert commit.c:540:33
    #4 0x71800f in commit_list_insert_by_date commit.c:604:9
    #5 0x8f8d2e in process_parents revision.c:1128:5
    #6 0x8f2f2c in limit_list revision.c:1418:7
    #7 0x8f210e in prepare_revision_walk revision.c:3577:7
    #8 0x514170 in orphaned_commit_warning builtin/checkout.c:1185:6
    #9 0x512f05 in switch_branches builtin/checkout.c:1250:3
    #10 0x50f8de in checkout_branch builtin/checkout.c:1646:9
    #11 0x50ba12 in checkout_main builtin/checkout.c:2003:9
    #12 0x5086c0 in cmd_checkout builtin/checkout.c:2055:8
    #13 0x4cd91d in run_builtin git.c:467:11
    #14 0x4cb5f3 in handle_builtin git.c:719:3
    #15 0x4ccf47 in run_argv git.c:808:4
    #16 0x4caf49 in cmd_main git.c:939:19
    #17 0x69dc0e in main common-main.c:52:11
    #18 0x7faaabd0e349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Indirect leak of 48 byte(s) in 3 object(s) allocated from:
    #0 0x49a85d in malloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
    #1 0x9ac084 in do_xmalloc wrapper.c:41:8
    #2 0x9ac05a in xmalloc wrapper.c:62:9
    #3 0x717de6 in commit_list_append commit.c:1609:35
    #4 0x8f1f9b in prepare_revision_walk revision.c:3554:12
    #5 0x514170 in orphaned_commit_warning builtin/checkout.c:1185:6
    #6 0x512f05 in switch_branches builtin/checkout.c:1250:3
    #7 0x50f8de in checkout_branch builtin/checkout.c:1646:9
    #8 0x50ba12 in checkout_main builtin/checkout.c:2003:9
    #9 0x5086c0 in cmd_checkout builtin/checkout.c:2055:8
    #10 0x4cd91d in run_builtin git.c:467:11
    #11 0x4cb5f3 in handle_builtin git.c:719:3
    #12 0x4ccf47 in run_argv git.c:808:4
    #13 0x4caf49 in cmd_main git.c:939:19
    #14 0x69dc0e in main common-main.c:52:11
    #15 0x7faaabd0e349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 09:25:44 +09:00
3dd71461e2 hex: print objects using the hash algorithm member
Now that all code paths correctly set the hash algorithm member of
struct object_id, write an object's hex representation using the hash
algorithm member embedded in it.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27 16:31:39 +09:00
b8505ecbf2 hex: default to the_hash_algo on zero algorithm value
There are numerous places in the codebase where we assume we can
initialize data by zeroing all its bytes.  However, when we do that with
a struct object_id, it leaves the structure with a zero value for the
algorithm, which is invalid.

We could forbid this pattern and require that all struct object_id
instances be initialized using oidclr, but this seems burdensome and
it's unnatural to most C programmers.  Instead, if the algorithm is
zero, assume we wanted to use the default hash algorithm instead.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27 16:31:39 +09:00
71b7672b67 builtin/pack-objects: avoid using struct object_id for pack hash
We use struct object_id for the names of objects.  It isn't intended to
be used for other hash values that don't name objects such as the pack
hash.

Because struct object_id will soon need to have its algorithm member
set, using it in this code path would mean that we didn't set that
member, only the hash member, which would result in a crash.  For both
of these reasons, switch to using an unsigned char array of size
GIT_MAX_RAWSZ.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27 16:31:39 +09:00
72871b132c commit-graph: don't store file hashes as struct object_id
The idea behind struct object_id is that it is supposed to represent the
identifier of a standard Git object or a special pseudo-object like the
all-zeros object ID.  In this case, we have file hashes, which, while
similar, are distinct from the identifiers of objects.

Switch these code paths to use an unsigned char array.  This is both
more logically consistent and it means that we need not set the
algorithm identifier for the struct object_id.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27 16:31:39 +09:00
dd15f4f457 builtin/show-index: set the algorithm for object IDs
In most cases, when we load the hash of an object into a struct
object_id, we load it using one of the oid* or *_oid_hex functions.
However, for git show-index, we read it in directly using fread.  As a
consequence, set the algorithm correctly so the objects can be used
correctly both now and in the future.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27 16:31:39 +09:00
14228447c9 hash: provide per-algorithm null OIDs
Up until recently, object IDs did not have an algorithm member, only a
hash.  Consequently, it was possible to share one null (all-zeros)
object ID among all hash algorithms.  Now that we're going to be
handling objects from multiple hash algorithms, it's important to make
sure that all object IDs have a correct algorithm field.

Introduce a per-algorithm null OID, and add it to struct hash_algo.
Introduce a wrapper function as well, and use it everywhere we used to
use the null_oid constant.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27 16:31:39 +09:00
5a6dce70d7 hash: set, copy, and use algo field in struct object_id
Now that struct object_id has an algorithm field, we should populate it.
This will allow us to handle object IDs in any supported algorithm and
distinguish between them.  Ensure that the field is written whenever we
write an object ID by storing it explicitly every time we write an
object.  Set values for the empty blob and tree values as well.

In addition, use the algorithm field to compare object IDs.  Note that
because we zero-initialize struct object_id in many places throughout
the codebase, we default to the default algorithm in cases where the
algorithm field is zero rather than explicitly initialize all of those
locations.

This leads to a branch on every comparison, but the alternative is to
compare the entire buffer each time and padding the buffer for SHA-1.
That alternative ranges up to 3.9% worse than this approach on the perf
t0001, t1450, and t1451.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27 16:31:38 +09:00
0e5e2284f1 builtin/pack-redundant: avoid casting buffers to struct object_id
Now that we need our instances of struct object_id to be zero padded, we
can no longer cast unsigned char buffers to be pointers to struct
object_id.  This file reads data out of the pack objects and then
inserts it directly into a linked list item which is a pointer to struct
object_id.  Instead, let's have the linked list item hold its own struct
object_id and copy the data into it.

In addition, since these are not really pointers to struct object_id,
stop passing them around as such, and call them what they really are:
pointers to unsigned char.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27 16:31:38 +09:00
5951bf467e Use the final_oid_fn to finalize hashing of object IDs
When we're hashing a value which is going to be an object ID, we want to
zero-pad that value if necessary.  To do so, use the final_oid_fn
instead of the final_fn anytime we're going to create an object ID to
ensure we perform this operation.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27 16:31:38 +09:00
ab795f0d77 hash: add a function to finalize object IDs
To avoid the penalty of having to branch in hash comparison functions,
we'll want to always compare the full hash member in a struct object_id,
which will require that SHA-1 object IDs be zero-padded.  To do so, add
a function which finalizes a hash context and writes it into an object
ID that performs this padding.

Move the definition of struct object_id and the constant definitions
higher up so we they are available for us to use.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27 16:31:38 +09:00
c3b4e4ee36 http-push: set algorithm when reading object ID
In most places in the codebase, we use oidread to properly read an
object ID into a struct object_id.  However, in the HTTP code, we end up
needing to parse a loose object path with a slash in it, so we can't do
that.  Let's instead explicitly set the algorithm in this function so we
can rely on it in the future.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27 16:31:38 +09:00
92e2cab96b Always use oidread to read into struct object_id
In the future, we'll want oidread to automatically set the hash
algorithm member for an object ID we read into it, so ensure we use
oidread instead of hashcpy everywhere we're copying a hash value into a
struct object_id.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27 16:31:38 +09:00
cf0983213c hash: add an algo member to struct object_id
Now that we're working with multiple hash algorithms in the same repo,
it's best if we label each object ID with its algorithm so we can
determine how to format a given object ID. Add a member called algo to
struct object_id.

Performance testing on object ID-heavy workloads doesn't reveal a clear
change in performance.  Out of performance tests t0001 and t1450, there
are slight variations in performance both up and down, but all
measurements are within the margin of error.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27 16:31:38 +09:00
b722d4560e pretty: provide human date format
Add the placeholders %ah and %ch to format author date and committer
date, like --date=human does, which provides more humanity date output.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27 16:09:32 +09:00
3593ebd3f5 pretty tests: give --date/format tests a better description
Change the description for the --date/format equivalency tests added
in 466fb6742d (pretty: provide a strict ISO 8601 date format,
2014-08-29) and 0df621172d (pretty: provide short date format,
2019-11-19) to be more meaningful.

This allows us to reword the comment added in the former commit to
refer to both tests, and any other future test, such as the in-flight
--date=human format being proposed in [1].

1. http://lore.kernel.org/git/pull.939.v2.git.1619275340051.gitgitgadget@gmail.com

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27 16:08:54 +09:00
fbfcaec8d8 pretty tests: simplify %aI/%cI date format test
Change a needlessly complex test for the %aI/%cI date
formats (iso-strict) added in 466fb6742d (pretty: provide a strict
ISO 8601 date format, 2014-08-29) to instead use the same pattern used
to test %as/%cs since 0df621172d (pretty: provide short date format,
2019-11-19).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27 16:05:56 +09:00
34c319970d refs/debug: trace into reflog expiry too
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27 15:59:39 +09:00
7cdb096903 git-completion.bash: consolidate cases in _git_stash()
The $subcommand case statement in _git_stash() is quite repetitive.
Consolidate the cases together into one catch-all case to reduce the
repetition.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27 15:41:07 +09:00
59d85a2a05 git-completion.bash: use $__git_cmd_idx in more places
With the introduction of the $__git_cmd_idx variable in e94fb44042
(git-completion.bash: pass $__git_subcommand_idx from __git_main(),
2021-03-24), completion functions were able to know the index at which
the git command is listed, allowing them to skip options that are given
to the underlying git itself, not the corresponding command (e.g.
`-C asdf` in `git -C asdf branch`).

While most of the changes here are self-explanatory, some bear further
explanation.

For the __git_find_on_cmdline() and __git_find_last_on_cmdline() pair of
functions, these functions are only ever called in the context of a git
command completion function. These functions will only care about words
after the command so we can safely ignore the words before this.

For _git_worktree(), this change is technically a no-op (once the
__git_find_last_on_cmdline change is also applied). It was in poor style
to have hard-coded on the index right after `worktree`. In case
`git worktree` were to ever learn to accept options, the current
situation would be inflexible.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27 15:41:07 +09:00
87e629756f git-completion.bash: rename to $__git_cmd_idx
In e94fb44042 (git-completion.bash: pass $__git_subcommand_idx from
__git_main(), 2021-03-24), the $__git_subcommand_idx variable was
introduced. Naming it after the index of the subcommand is needlessly
confusing as, when this variable is used, it is in the completion
functions for commands (e.g. _git_remote()) where for `git remote add`,
the `remote` is referred to as the command and `add` is referred to as
the subcommand.

Rename this variable so that it's obvious it's about git commands. While
we're at it, shorten up its name so that it's still readable without
being a handful to type.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27 15:41:07 +09:00
482d549906 t1300: fix unset of GIT_CONFIG_NOSYSTEM leaking into subsequent tests
In order to test whether the new GIT_CONFIG_SYSTEM environment variable
behaves as expected, we unset GIT_CONFIG_NOSYSTEM in one of our tests in
t1300. But because tests are not executed in a subshell, this unset
leaks into all subsequent tests and may thus cause them to fail in some
environments. These failures are easily reproducable with `make
prefix=/root test`.

Fix the issue by not using `sane_unset GIT_CONFIG_NOSYSTEM`, but instead
just manually add it to the environment of the two command invocations
which need it.

Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27 15:15:34 +09:00
a2ba162cda object-info: support for retrieving object info
Sometimes it is useful to get information of an object without having to
download it completely.

Add the "object-info" capability that lets the client ask for
object-related information with their full hexadecimal object names.

Only sizes are returned for now.

Signed-off-by: Bruno Albuquerque <bga@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-20 17:41:13 -07:00
311531c9de The twelfth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-20 17:23:37 -07:00
4090b6973b Merge branch 'js/access-nul-emulation-on-windows'
Portability fix.

* js/access-nul-emulation-on-windows:
  msvc: avoid calling `access("NUL", flags)`
2021-04-20 17:23:37 -07:00
b9fa3ba0ca Merge branch 'sg/bugreport-fixes'
The dependencies for config-list.h and command-list.h were broken
when the former was split out of the latter, which has been
corrected.

* sg/bugreport-fixes:
  Makefile: add missing dependencies of 'config-list.h'
2021-04-20 17:23:37 -07:00
092bf77e8c Merge branch 'jc/doc-do-not-capitalize-clarification'
Doc update for developers.

* jc/doc-do-not-capitalize-clarification:
  doc: clarify "do not capitalize the first word" rule
2021-04-20 17:23:36 -07:00
fdef940afe Merge branch 'ab/usage-error-docs'
Documentation updates, with unrelated comment updates, too.

* ab/usage-error-docs:
  api docs: document that BUG() emits a trace2 error event
  api docs: document BUG() in api-error-handling.txt
  usage.c: don't copy/paste the same comment three times
2021-04-20 17:23:36 -07:00
522010b573 Merge branch 'ab/detox-gettext-tests'
Test clean-up.

* ab/detox-gettext-tests:
  tests: remove all uses of test_i18cmp
2021-04-20 17:23:36 -07:00
e02f75c9eb Merge branch 'jt/fetch-pack-request-fix'
* jt/fetch-pack-request-fix:
  fetch-pack: buffer object-format with other args
2021-04-20 17:23:36 -07:00
196cc525e2 Merge branch 'hn/reftable-tables-doc-update'
Doc updte.

* hn/reftable-tables-doc-update:
  reftable: document an alternate cleanup method on Windows
2021-04-20 17:23:35 -07:00
2eebac2c49 Merge branch 'jk/pack-objects-bitmap-progress-fix'
When "git pack-objects" makes a literal copy of a part of existing
packfile using the reachability bitmaps, its update to the progress
meter was broken.

* jk/pack-objects-bitmap-progress-fix:
  pack-objects: update "nr_seen" progress based on pack-reused count
2021-04-20 17:23:35 -07:00
ab99efc817 Merge branch 'ab/userdiff-tests'
A bit of code clean-up and a lot of test clean-up around userdiff
area.

* ab/userdiff-tests:
  blame tests: simplify userdiff driver test
  blame tests: don't rely on t/t4018/ directory
  userdiff: remove support for "broken" tests
  userdiff tests: list builtin drivers via test-tool
  userdiff tests: explicitly test "default" pattern
  userdiff: add and use for_each_userdiff_driver()
  userdiff style: normalize pascal regex declaration
  userdiff style: declare patterns with consistent style
  userdiff style: re-order drivers in alphabetical order
2021-04-20 17:23:34 -07:00
6d7a62d74d Merge branch 'ar/userdiff-scheme'
Userdiff patterns for "Scheme" has been added.

* ar/userdiff-scheme:
  userdiff: add support for Scheme
2021-04-20 17:23:34 -07:00
8c8c8c0e16 git-completion.bash: separate some commands onto their own line
In e94fb44042 (git-completion.bash: pass $__git_subcommand_idx from
__git_main(), 2021-03-24), a line was introduced which contained
multiple statements. This is difficult to read so break it into multiple
lines.

While we're at it, follow this convention for the rest of the
__git_main() and break up lines that contain multiple statements.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-20 13:27:35 -07:00
9364bf465d doc: clarify the filename encoding in git diff
AFAICT parsing the output of `git diff --name-only master...feature`
is the intended way of programmatically getting the list of files
modified
by a feature branch. It is impossible to parse text unless you know what
encoding it is in. The output encoding of diff --name-only and

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-20 12:57:26 -07:00
844c3f0b0b ref-filter: reuse output buffer
When we use `git for-each-ref`, every ref will allocate
its own output strbuf and error strbuf. But we can reuse
the final strbuf for each step ref's output. The error
buffer will also be reused, despite the fact that the git
will exit when `format_ref_array_item()` return a non-zero
value and output the contents of the error buffer.

The performance for `git for-each-ref` on the Git repository
itself with performance testing tool `hyperfine` changes from
23.7 ms ± 0.9 ms to 22.2 ms ± 1.0 ms. Optimization is relatively
minor.

At the same time, we apply this optimization to `git tag -l`
and `git branch -l`.

This approach is similar to the one used by 79ed0a5
(cat-file: use a single strbuf for all output, 2018-08-14)
to speed up the cat-file builtin.

Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Jeff King <peff@peff.net>
Helped-by: René Scharfe <l.s.r@web.de>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-20 11:09:50 -07:00
22f69a85ed ref-filter: get rid of show_ref_array_item
Inlining the exported function `show_ref_array_item()`,
which is not providing the right level of abstraction,
simplifies the API and can unlock improvements at the
former call sites.

Helped-by: René Scharfe <l.s.r@web.de>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-19 15:08:00 -07:00
68e66f2987 parallel-checkout: add design documentation
Co-authored-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-19 15:05:25 -07:00
4179b4897f config: allow overriding of global and system configuration
In order to have git run in a fully controlled environment without any
misconfiguration, it may be desirable for users or scripts to override
global- and system-level configuration files. We already have a way of
doing this, which is to unset both HOME and XDG_CONFIG_HOME environment
variables and to set `GIT_CONFIG_NOGLOBAL=true`. This is quite kludgy,
and unsetting the first two variables likely has an impact on other
executables spawned by such a script.

The obvious way to fix this would be to introduce `GIT_CONFIG_NOGLOBAL`
as an equivalent to `GIT_CONFIG_NOSYSTEM`. But in the past, it has
turned out that this design is inflexible: we cannot test system-level
parsing of the git configuration in our test harness because there is no
way to change its location, so all tests run with `GIT_CONFIG_NOSYSTEM`
set.

Instead of doing the same mistake with `GIT_CONFIG_NOGLOBAL`, introduce
two new variables `GIT_CONFIG_GLOBAL` and `GIT_CONFIG_SYSTEM`:

    - If unset, git continues to use the usual locations.

    - If set to a specific path, we skip reading the normal
      configuration files and instead take the path. By setting the path
      to `/dev/null`, no configuration will be loaded for the respective
      level.

This implements the usecase where we want to execute code in a sanitized
environment without any potential misconfigurations via `/dev/null`, but
is more flexible and allows for more usecases than simply adding
`GIT_CONFIG_NOGLOBAL`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-19 14:16:59 -07:00
1e06eb9b5d config: unify code paths to get global config paths
There's two callsites which assemble global config paths, once in the
config loading code and once in the git-config(1) builtin. We're about
to implement a way to override global config paths via an environment
variable which would require us to adjust both sites.

Unify both code paths into a single `git_global_config()` function which
returns both paths for `~/.gitconfig` and the XDG config file. This will
make the subsequent patch which introduces the new envvar easier to
implement.

No functional changes are expected from this patch.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-19 14:16:59 -07:00
c62a999c6e config: rename git_etc_config()
The `git_etc_gitconfig()` function retrieves the system-level path of
the configuration file. We're about to introduce a way to override it
via an environment variable, at which point the name of this function
would start to become misleading.

Rename the function to `git_system_config()` as a preparatory step.
While at it, the function is also refactored to pass memory ownership to
the caller. This is done to better match semantics of
`git_global_config()`, which is going to be introduced in the next
commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-19 14:16:59 -07:00
9cf68b27d5 rev-list: allow filtering of provided items
When providing an object filter, it is currently impossible to also
filter provided items. E.g. when executing `git rev-list HEAD` , the
commit this reference points to will be treated as user-provided and is
thus excluded from the filtering mechanism. This makes it harder than
necessary to properly use the new `--filter=object:type` filter given
that even if the user wants to only see blobs, he'll still see commits
of provided references.

Improve this by introducing a new `--filter-provided-objects` option
to the git-rev-parse(1) command. If given, then all user-provided
references will be subject to filtering.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-19 14:09:11 -07:00
169a15ebd6 pack-bitmap: implement combined filter
When the user has multiple objects filters specified, then this is
internally represented by having a "combined" filter. These combined
filters aren't yet supported by bitmap indices and can thus not be
accelerated.

Fix this by implementing support for these combined filters. The
implementation is quite trivial: when there's a combined filter, we
simply recurse into `filter_bitmap()` for all of the sub-filters.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-19 14:09:11 -07:00
7ab6aafa58 pack-bitmap: implement object type filter
The preceding commit has added a new object filter for git-rev-list(1)
which allows to filter objects by type. Implement the equivalent filter
for packfile bitmaps so that we can answer these queries fast.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-19 14:09:11 -07:00
b0c42a53c9 list-objects: implement object type filter
While it already is possible to filter objects by some criteria in
git-rev-list(1), it is not yet possible to filter out only a specific
type of objects. This makes some filters less useful. The `blob:limit`
filter for example filters blobs such that only those which are smaller
than the given limit are returned. But it is unfit to ask only for these
smallish blobs, given that git-rev-list(1) will continue to print tags,
commits and trees.

Now that we have the infrastructure in place to also filter tags and
commits, we can improve this situation by implementing a new filter
which selects objects based on their type. Above query can thus
trivially be implemented with the following command:

    $ git rev-list --objects --filter=object:type=blob \
        --filter=blob:limit=200

Furthermore, this filter allows to optimize for certain other cases: if
for example only tags or commits have been selected, there is no need to
walk down trees.

The new filter is not yet supported in bitmaps. This is going to be
implemented in a subsequent commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-19 14:09:11 -07:00
1c4d6f46be parallel-checkout: support progress displaying
Original-patch-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-19 11:57:05 -07:00
7531e4b66e parallel-checkout: add configuration options
Make parallel checkout configurable by introducing two new settings:
checkout.workers and checkout.thresholdForParallelism. The first defines
the number of workers (where one means sequential checkout), and the
second defines the minimum number of entries to attempt parallel
checkout.

To decide the default value for checkout.workers, the parallel version
was benchmarked during three operations in the linux repo, with cold
cache: cloning v5.8, checking out v5.8 from v2.6.15 (checkout I) and
checking out v5.8 from v5.7 (checkout II). The four tables below show
the mean run times and standard deviations for 5 runs in: a local file
system on SSD, a local file system on HDD, a Linux NFS server, and
Amazon EFS (all on Linux). Each parallel checkout test was executed with
the number of workers that brings the best overall results in that
environment.

Local SSD:
             Sequential             10 workers            Speedup
Clone        8.805 s ± 0.043 s      3.564 s ± 0.041 s     2.47 ± 0.03
Checkout I   9.678 s ± 0.057 s      4.486 s ± 0.050 s     2.16 ± 0.03
Checkout II  5.034 s ± 0.072 s      3.021 s ± 0.038 s     1.67 ± 0.03

Local HDD:
             Sequential             10 workers             Speedup
Clone        32.288 s ± 0.580 s     30.724 s ± 0.522 s    1.05 ± 0.03
Checkout I   54.172 s ±  7.119 s    54.429 s ± 6.738 s    1.00 ± 0.18
Checkout II  40.465 s ± 2.402 s     38.682 s ± 1.365 s    1.05 ± 0.07

Linux NFS server (v4.1, on EBS, single availability zone):

             Sequential             32 workers            Speedup
Clone        240.368 s ± 6.347 s    57.349 s ± 0.870 s    4.19 ± 0.13
Checkout I   242.862 s ± 2.215 s    58.700 s ± 0.904 s    4.14 ± 0.07
Checkout II  65.751 s ± 1.577 s     23.820 s ± 0.407 s    2.76 ± 0.08

EFS (v4.1, replicated over multiple availability zones):

             Sequential             32 workers            Speedup
Clone        922.321 s ± 2.274 s    210.453 s ± 3.412 s   4.38 ± 0.07
Checkout I   1011.300 s ± 7.346 s   297.828 s ± 0.964 s   3.40 ± 0.03
Checkout II  294.104 s ± 1.836 s    126.017 s ± 1.190 s   2.33 ± 0.03

The above benchmarks show that parallel checkout is most effective on
repositories located on an SSD or over a distributed file system. For
local file systems on spinning disks, and/or older machines, the
parallelism does not always bring a good performance. For this reason,
the default value for checkout.workers is one, a.k.a. sequential
checkout.

To decide the default value for checkout.thresholdForParallelism,
another benchmark was executed in the "Local SSD" setup, where parallel
checkout showed to be beneficial. This time, we compared the runtime of
a `git checkout -f`, with and without parallelism, after randomly
removing an increasing number of files from the Linux working tree. The
"sequential fallback" column below corresponds to the executions where
checkout.workers was 10 but checkout.thresholdForParallelism was equal
to the number of to-be-updated files plus one (so that we end up writing
sequentially). Each test case was sampled 15 times, and each sample had
a randomly different set of files removed. Here are the results:

             sequential fallback   10 workers           speedup
10   files    772.3 ms ± 12.6 ms   769.0 ms ± 13.6 ms   1.00 ± 0.02
20   files    780.5 ms ± 15.8 ms   775.2 ms ±  9.2 ms   1.01 ± 0.02
50   files    806.2 ms ± 13.8 ms   767.4 ms ±  8.5 ms   1.05 ± 0.02
100  files    833.7 ms ± 21.4 ms   750.5 ms ± 16.8 ms   1.11 ± 0.04
200  files    897.6 ms ± 30.9 ms   730.5 ms ± 14.7 ms   1.23 ± 0.05
500  files   1035.4 ms ± 48.0 ms   677.1 ms ± 22.3 ms   1.53 ± 0.09
1000 files   1244.6 ms ± 35.6 ms   654.0 ms ± 38.3 ms   1.90 ± 0.12
2000 files   1488.8 ms ± 53.4 ms   658.8 ms ± 23.8 ms   2.26 ± 0.12

From the above numbers, 100 files seems to be a reasonable default value
for the threshold setting.

Note: Up to 1000 files, we observe a drop in the execution time of the
parallel code with an increase in the number of files. This is a rather
odd behavior, but it was observed in multiple repetitions. Above 1000
files, the execution time increases according to the number of files, as
one would expect.

About the test environments: Local SSD tests were executed on an
i7-7700HQ (4 cores with hyper-threading) running Manjaro Linux. Local
HDD tests were executed on an Intel(R) Xeon(R) E3-1230 (also 4 cores
with hyper-threading), HDD Seagate Barracuda 7200.14 SATA 3.1, running
Debian. NFS and EFS tests were executed on an Amazon EC2 c5n.xlarge
instance, with 4 vCPUs. The Linux NFS server was running on a m6g.large
instance with 2 vCPUSs and a 1 TB EBS GP2 volume. Before each timing,
the linux repository was removed (or checked out back to its previous
state), and `sync && sysctl vm.drop_caches=3` was executed.

Co-authored-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-19 11:57:05 -07:00
e9e8adf1a8 parallel-checkout: make it truly parallel
Use multiple worker processes to distribute the queued entries and call
write_pc_item() in parallel for them. The items are distributed
uniformly in contiguous chunks. This minimizes the chances of two
workers writing to the same directory simultaneously, which could affect
performance due to lock contention in the kernel. Work stealing (or any
other format of re-distribution) is not implemented yet.

The protocol between the main process and the workers is quite simple.
They exchange binary messages packed in pkt-line format, and use
PKT-FLUSH to mark the end of input (from both sides). The main process
starts the communication by sending N pkt-lines, each corresponding to
an item that needs to be written. These packets contain all the
necessary information to load, smudge, and write the blob associated
with each item. Then it waits for the worker to send back N pkt-lines
containing the results for each item. The resulting packet must contain:
the identification number of the item that it refers to, the status of
the operation, and the lstat() data gathered after writing the file (iff
the operation was successful).

For now, checkout always uses a hardcoded value of 2 workers, only to
demonstrate that the parallel checkout framework correctly divides and
writes the queued entries. The next patch will add user configurations
and define a more reasonable default, based on tests with the said
settings.

Co-authored-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Co-authored-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-19 11:57:05 -07:00
04155bdad8 unpack-trees: add basic support for parallel checkout
This new interface allows us to enqueue some of the entries being
checked out to later uncompress them, apply in-process filters, and
write out the files in parallel. For now, the parallel checkout
machinery is enabled by default and there is no user configuration, but
run_parallel_checkout() just writes the queued entries in sequence
(without spawning additional workers). The next patch will actually
implement the parallelism and, later, we will make it configurable.

Note that, to avoid potential data races, not all entries are eligible
for parallel checkout. Also, paths that collide on disk (e.g.
case-sensitive paths in case-insensitive file systems), are detected by
the parallel checkout code and skipped, so that they can be safely
sequentially handled later. The collision detection works like the
following:

- If the collision was at basename (e.g. 'a/b' and 'a/B'), the framework
  detects it by looking for EEXIST and EISDIR errors after an
  open(O_CREAT | O_EXCL) failure.

- If the collision was at dirname (e.g. 'a/b' and 'A'), it is detected
  at the has_dirs_only_path() check, which is done for the leading path
  of each item in the parallel checkout queue.

Both verifications rely on the fact that, before enqueueing an entry for
parallel checkout, checkout_entry() makes sure that there is no file at
the entry's path and that its leading components are all real
directories. So, any later change in these conditions indicates that
there was a collision (either between two parallel-eligible entries or
between an eligible and an ineligible one).

After all parallel-eligible entries have been processed, the collided
(and thus, skipped) entries are sequentially fed to checkout_entry()
again. This is similar to the way the current code deals with
collisions, overwriting the previously checked out entries with the
subsequent ones. The only difference is that, since we no longer create
the files in the same order that they appear on index, we are not able
to determine which of the colliding entries will survive on disk (for
the classic code, it is always the last entry).

Co-authored-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Co-authored-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-19 11:57:05 -07:00
364bc11fe5 doc/diff-options: document new --diff-merges features
Document changes in -m and --diff-merges=m semantics, as well as new
--diff-merges=on option.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-16 23:38:35 -07:00
17c13e60fd diff-merges: introduce log.diffMerges config variable
New log.diffMerges configuration variable sets the format that
--diff-merges=on will be using. The default is "separate".

t4013: add the following tests for log.diffMerges config:

* Test that wrong values are denied.

* Test that the value of log.diffMerges properly affects both
--diff-merges=on and -m.

t9902: fix completion tests for log.d* to match log.diffMerges.

Added documentation for log.diffMerges.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-16 23:38:35 -07:00
38fc4dbbc2 diff-merges: adapt -m to enable default diff format
Let -m option (and --diff-merges=m) enable the default format instead
of "separate", to be able to tune it with log.diffMerges option.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-16 23:38:35 -07:00
26a0f58da8 diff-merges: refactor set_diff_merges()
Split set_diff_merges() into separate parsing and execution functions,
the former to be reused for parsing of configuration values later in
the patch series.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-16 23:38:35 -07:00
4320815eb9 diff-merges: introduce --diff-merges=on
Introduce the notion of default diff format for merges, and the option
"on" to select it. The default format is "separate" and can't yet
be changed, so effectively "on" is just a synonym for "separate"
for now. Add corresponding test to t4013.

This is in preparation for introducing log.diffMerges configuration
option that will let --diff-merges=on to be configured to any
supported format.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-16 23:38:35 -07:00
b0c09ab879 The eleventh (aka "ort") batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-16 13:53:34 -07:00
257ae76ba9 Merge branch 'ah/merge-ort-ubsan-fix'
Code clean-up for merge-ort backend.

* ah/merge-ort-ubsan-fix:
  merge-ort: only do pointer arithmetic for non-empty lists
2021-04-16 13:53:34 -07:00
7bec8e7fa6 Merge branch 'en/ort-readiness'
Plug the ort merge backend throughout the rest of the system, and
start testing it as a replacement for the recursive backend.

* en/ort-readiness:
  Add testing with merge-ort merge strategy
  t6423: mark remaining expected failure under merge-ort as such
  Revert "merge-ort: ignore the directory rename split conflict for now"
  merge-recursive: add a bunch of FIXME comments documenting known bugs
  merge-ort: write $GIT_DIR/AUTO_MERGE whenever we hit a conflict
  t: mark several submodule merging tests as fixed under merge-ort
  merge-ort: implement CE_SKIP_WORKTREE handling with conflicted entries
  t6428: new test for SKIP_WORKTREE handling and conflicts
  merge-ort: support subtree shifting
  merge-ort: let renormalization change modify/delete into clean delete
  merge-ort: have ll_merge() use a special attr_index for renormalization
  merge-ort: add a special minimal index just for renormalization
  merge-ort: use STABLE_QSORT instead of QSORT where required
2021-04-16 13:53:34 -07:00
e2e1a03f6b Merge branch 'en/ort-perf-batch-10'
Various rename detection optimization to help "ort" merge strategy
backend.

* en/ort-perf-batch-10:
  diffcore-rename: determine which relevant_sources are no longer relevant
  merge-ort: record the reason that we want a rename for a file
  diffcore-rename: add computation of number of unknown renames
  diffcore-rename: check if we have enough renames for directories early on
  diffcore-rename: only compute dir_rename_count for relevant directories
  merge-ort: record the reason that we want a rename for a directory
  merge-ort, diffcore-rename: tweak dirs_removed and relevant_source type
  diffcore-rename: take advantage of "majority rules" to skip more renames
2021-04-16 13:53:33 -07:00
76655e8a28 completion: avoid aliased command lookup error in nounset mode
Aliased command lookup accesses the `list` variable before it has been
set, causing an error in "nounset" mode. Initialize to an empty string
to avoid that.

    $ git nonexistent-command <Tab>bash: list: unbound variable

Signed-off-by: Ville Skyttä <ville.skytta@iki.fi>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-16 13:40:52 -07:00
32f67888d8 maintenance: respect remote.*.skipFetchAll
If a remote has the skipFetchAll setting enabled, then that remote is
not intended for frequent fetching. It makes sense to not fetch that
data during the 'prefetch' maintenance task. Skip that remote in the
iteration without error. The skip_default_update member is initialized
in remote.c:handle_config() as part of initializing the 'struct remote'.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-16 13:36:55 -07:00
cfd781ea22 maintenance: use 'git fetch --prefetch'
The 'prefetch' maintenance task previously forced the following refspec
for each remote:

	+refs/heads/*:refs/prefetch/<remote>/*

If a user has specified a more strict refspec for the remote, then this
prefetch task downloads more objects than necessary.

The previous change introduced the '--prefetch' option to 'git fetch'
which manipulates the remote's refspec to place all resulting refs into
refs/prefetch/, with further partitioning based on the destinations of
those refspecs.

Update the documentation to be more generic about the destination refs.
Do not mention custom refspecs explicitly, as that does not need to be
highlighted in this documentation. The important part of placing refs in
refs/prefetch/ remains.

Reported-by: Tom Saeger <tom.saeger@oracle.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-16 13:36:55 -07:00
2e03115d0c fetch: add --prefetch option
The --prefetch option will be used by the 'prefetch' maintenance task
instead of sending refspecs explicitly across the command-line. The
intention is to modify the refspec to place all results in
refs/prefetch/ instead of anywhere else.

Create helper method filter_prefetch_refspec() to modify a given refspec
to fit the rules expected of the prefetch task:

 * Negative refspecs are preserved.
 * Refspecs without a destination are removed.
 * Refspecs whose source starts with "refs/tags/" are removed.
 * Other refspecs are placed within "refs/prefetch/".

Finally, we add the 'force' option to ensure that prefetch refs are
replaced as necessary.

There are some interesting cases that are worth testing.

An earlier version of this change dropped the "i--" from the loop that
deletes a refspec item and shifts the remaining entries down. This
allowed some refspecs to not be modified. The subtle part about the
first --prefetch test is that the "refs/tags/*" refspec appears directly
before the "refs/heads/bogus/*" refspec. Without that "i--", this
ordering would remove the "refs/tags/*" refspec and leave the last one
unmodified, placing the result in "refs/heads/*".

It is possible to have an empty refspec. This is typically the case for
remotes other than the origin, where users want to fetch a specific tag
or branch. To correctly test this case, we need to further remove the
upstream remote for the local branch. Thus, we are testing a refspec
that will be deleted, leaving nothing to fetch.

Helped-by: Tom Saeger <tom.saeger@oracle.com>
Helped-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-16 13:36:55 -07:00
9160068ac6 msvc: avoid calling access("NUL", flags)
Apparently this is not supported with Microsoft's Universal C Runtime.
So let's not actually do that.

Instead, just return success because we _know_ that we expect the `NUL`
device to be present.

Side note: it is possible to turn off the "Null device driver" and
thereby disable `NUL`. Too many things are broken if this driver is
disabled, therefore it is not worth bothering to try to detect its
presence when `access()` is called.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-16 12:05:32 -07:00
332ec963bc pkt-line: do not report packet write errors twice
On write() errors, packet_write() dies with the same error message that
is already printed by its callee, packet_write_gently(). This produces
an unnecessarily verbose and repetitive output:

error: packet write failed
fatal: packet write failed: <strerror() message>

In addition to that, packet_write_gently() does not always fulfill its
caller expectation that errno will be properly set before a non-zero
return. In particular, that is not the case for a "data exceeds max
packet size" error. So, in this case, packet_write() will call
die_errno() and print an strerror(errno) message that might be totally
unrelated to the actual error.

Fix both those issues by turning packet_write() and
packet_write_gently() into wrappers to a common lower level function
that doesn't print the error message, but instead returns it on a buffer
for the caller to die() or error() as appropriate.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-15 15:05:31 -07:00
d1b10fc6d8 The tenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-15 13:36:01 -07:00
5a7e52bed2 Merge branch 'jz/apply-3way-cached'
"git apply" now takes "--3way" and "--cached" at the same time, and
work and record results only in the index.

* jz/apply-3way-cached:
  git-apply: allow simultaneous --cached and --3way options
2021-04-15 13:36:01 -07:00
b98db1dd70 Merge branch 'ab/complete-cherry-pick-head'
The command line completion (in contrib/) has learned that
CHERRY_PICK_HEAD is a possible pseudo-ref.

* ab/complete-cherry-pick-head:
  bash completion: complete CHERRY_PICK_HEAD
2021-04-15 13:36:01 -07:00
771c758e8a Merge branch 'jz/apply-run-3way-first'
"git apply --3way" has always been "to fall back to 3-way merge
only when straight application fails". Swap the order of falling
back so that 3-way is always attempted first (only when the option
is given, of course) and then straight patch application is used as
a fallback when it fails.

* jz/apply-run-3way-first:
  git-apply: try threeway first when "--3way" is used
2021-04-15 13:36:00 -07:00
f3cce896a8 transport: respect verbosity when setting upstream
A command such as `git push -qu origin feature` will print "Branch
'feature' set up to track remote branch 'feature' from 'origin'." even
when --quiet is passed. In this case it's because install_branch_config() is
always called with BRANCH_CONFIG_VERBOSE.

struct transport keeps track of the desired verbosity. Fix the above
issue by passing BRANCH_CONFIG_VERBOSE conditionally based on that.

Signed-off-by: Øystein Walle <oystwa@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-15 12:52:49 -07:00
151b6c2dd7 doc: clarify "do not capitalize the first word" rule
The same "do not capitalize the first word" rule is applied to both
our patch titles and error messages, but the existing description
was fuzzy in two aspects.

 * For error messages, it was not said that this was only about the
   first word that begins the sentence.

 * For both, it was not clear when a capital letter there was not an
   error.  We avoid capitalizing the first word when the only reason
   you would capitalize it is because it happens to be the first
   word in the sentence.  If a proper noun, which is usually spelled
   in capital letters, happens to come at the beginning of the
   sentence, it should be kept in capital letters.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 23:41:00 -07:00
4589bca829 name-hash: use expand_to_path()
A sparse-index loads the name-hash data for its entries, including the
sparse-directory entries. If a caller asks for a path that is contained
within a sparse-directory entry, we need to expand to a full index and
recalculate the name hash table before returning the result. Insert
calls to expand_to_path() to protect against this case.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:48:01 -07:00
71f82d032f sparse-index: expand_to_path()
Some users of the index API have a specific path they are looking for,
but choose to use index_file_exists() to rely on the name-hash hashtable
instead of doing binary search with index_name_pos(). These users only
need to know a yes/no answer, not a position within the cache array.

When the index is sparse, the name-hash hash table does not contain the
full list of paths within sparse directories. It _does_ contain the
directory names for the sparse-directory entries.

Create a helper function, expand_to_path(), for intended use with the
name-hash hashtable functions. The integration with name-hash.c will
follow in a later change.

The solution here is to use ensure_full_index() when we determine that
the requested path is within a sparse directory entry. This will
populate the name-hash hashtable as the index is recomputed from
scratch.

There may be cases where the caller is trying to find an untracked path
that is not in the index but also is not within a sparse directory
entry. We want to minimize the overhead for these requests. If we used
index_name_pos() to find the insertion order of the path, then we could
determine from that position if a sparse-directory exists. (In fact,
just calling index_name_pos() in that case would lead to expanding the
index to a full index.) However, this takes O(log N) time where N is the
number of cache entries.

To keep the performance of this call based mostly on the input string,
use index_file_exists() to look for the ancestors of the path. Using the
heuristic that a sparse directory is likely to have a small number of
parent directories, we start from the bottom and build up. Use a string
buffer to allow mutating the path name to terminate after each slash for
each hashset test.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:47:54 -07:00
5f11669586 name-hash: don't add directories to name_hash
Sparse directory entries represent a directory that is outside the
sparse-checkout definition. These are not paths to blobs, so should not
be added to the name_hash table. Instead, they should be added to the
directory hashtable when 'ignore_case' is true.

Add a condition to avoid placing sparse directories into the name_hash
hashtable. This avoids filling the table with extra entries that will
never be queried.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:47:51 -07:00
f5fed74fb2 revision: ensure full index
Before iterating over all index entries, ensure that a sparse index is
expanded to a full index to avoid unexpected behavior. This case could
be integrated later by ensuring that we walk the tree in the
sparse-directory entry, but the current behavior is only expecting
blobs. Save this integration for later when it can be properly tested.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:47:48 -07:00
dc26b23ebc resolve-undo: ensure full index
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full index to avoid unexpected behavior.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:47:45 -07:00
0c18c059a1 read-cache: ensure full index
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full index to avoid unexpected behavior.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:47:42 -07:00
465a04abc6 pathspec: ensure full index
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full index to avoid unexpected behavior.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:47:40 -07:00
f7ef64be0c merge-recursive: ensure full index
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full index to avoid unexpected behavior.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:47:37 -07:00
3450a304aa entry: ensure full index
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full index to avoid unexpected behavior.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:47:35 -07:00
d425f65127 dir: ensure full index
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full index to avoid unexpected behavior.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:47:32 -07:00
2508df0272 update-index: ensure full index
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full index to avoid unexpected behavior.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:47:29 -07:00
a02912019a stash: ensure full index
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full index to avoid unexpected behavior.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:47:26 -07:00
e43e2a17d2 rm: ensure full index
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full index to avoid unexpected behavior.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:47:24 -07:00
299e2c4561 merge-index: ensure full index
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full one to avoid unexpected behavior.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:47:21 -07:00
42f44e84eb ls-files: ensure full index
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full one to avoid missing files.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:47:17 -07:00
46eb6e31ef grep: ensure full index
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full one so we do not miss blobs to scan. Later, this can
integrate more carefully with sparse indexes with proper testing.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:47:13 -07:00
2227ea175f fsck: ensure full index
When verifying all blobs reachable from the index, ensure that a sparse
index has been expanded to a full one to avoid missing some blobs.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:47:11 -07:00
48b3c7da6c difftool: ensure full index
Before iterating over all cache entries, ensure that a sparse index has
been expanded to a full one to avoid unexpected behavior.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:47:09 -07:00
cb8388df5b commit: ensure full index
These two loops iterate over all cache entries, so ensure that a sparse
index is expanded to a full index before we do so.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:47:06 -07:00
0f6d3ba6bd checkout: ensure full index
Before iterating over all cache entries in the checkout builtin, ensure
that we have a full index to avoid any unexpected behavior.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:47:03 -07:00
1b850d37f4 checkout-index: ensure full index
Before we iterate over all cache entries, ensure that the index is not
sparse. This loop in checkout_all() might be safe to iterate over a
sparse index, but let's put this protection here until it can be
carefully tested.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:46:59 -07:00
54beed24d2 add: ensure full index
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full index to avoid unexpected behavior.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:46:48 -07:00
118a2e8bde cache: move ensure_full_index() to cache.h
Soon we will insert ensure_full_index() calls across the codebase.
Instead of also adding include statements for sparse-index.h, let's just
use the fact that anything that cares about the index already has
cache.h in its includes.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:46:41 -07:00
95e0321c4d read-cache: expand on query into sparse-directory entry
Callers to index_name_pos() or index_name_stage_pos() have a specific
path in mind. If that happens to be a path with an ancestor being a
sparse-directory entry, it can lead to unexpected results.

In the case that we did not find the requested path, check to see if the
position _before_ the inserted position is a sparse directory entry that
matches the initial segment of the input path (including the directory
separator at the end of the directory name). If so, then expand the
index to be a full index and search again. This expansion will only
happen once per index read.

Future enhancements could be more careful to expand only the necessary
sparse directory entry, but then we would have a special "not fully
sparse, but also not fully expanded" mode that could affect writing the
index to file. Since this only occurs if a specific file is requested
outside of the sparse checkout definition, this is unlikely to be a
common situation.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:46:30 -07:00
847a9e5d4f *: remove 'const' qualifier for struct index_state
Several methods specify that they take a 'struct index_state' pointer
with the 'const' qualifier because they intend to only query the data,
not change it. However, we will be introducing a step very low in the
method stack that might modify a sparse-index to become a full index in
the case that our queries venture inside a sparse-directory entry.

This change only removes the 'const' qualifiers that are necessary for
the following change which will actually modify the implementation of
index_name_stage_pos().

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:46:00 -07:00
839a66349e sparse-index: API protection strategy
Edit and expand the sparse-index design document with the plan for
guarding index operations with ensure_full_index().

Notably, the plan has changed to not have an expand_to_path() method in
favor of checking for a sparse-directory hit inside of the
index_path_pos() API.

The changes that follow this one will incrementally add
ensure_full_index() guards to iterations over all cache entries. Some
iterations over the cache entries are not protected due to a few
categories listed in the document. Since these are not being modified,
here is a short list of the files and methods that will not receive
these guards:

Looking for non-zero stage:
* builtin/add.c:chmod_pathspec()
* builtin/merge.c:count_unmerged_entries()
* merge-ort.c:record_conflicted_index_entries()
* read-cache.c:unmerged_index()
* rerere.c:check_one_conflict(), find_conflict(), rerere_remaining()
* revision.c:prepare_show_merge()
* sequencer.c:append_conflicts_hint()
* wt-status.c:wt_status_collect_changes_initial()

Looking for submodules:
* builtin/submodule--helper.c:module_list_compute()
* submodule.c: several methods
* worktree.c:validate_no_submodules()

Part of the index API:
* name-hash.c: lazy init methods
* preload-index.c:preload_thread(), preload_index()
* read-cache.c: file format methods

Checking for correct order of cache entries:
* read-cache.c:check_ce_order()

Ignores SKIP_WORKTREE entries or already aware:
* unpack-trees.c:mark_new_skip_worktree()
* wt-status.c:wt_status_check_sparse_checkout()

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:45:34 -07:00
54a3917115 The ninth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-13 15:28:53 -07:00
e0d4a63c09 Merge branch 'vs/completion-with-set-u'
The command-line completion script (in contrib/) had a couple of
references that would have given a warning under the "-u" (nounset)
option.

* vs/completion-with-set-u:
  completion: audit and guard $GIT_* against unset use
2021-04-13 15:28:53 -07:00
e6545201ad Merge branch 'ab/detox-config-gettext'
The last remnant of gettext-poison has been removed.

* ab/detox-config-gettext:
  config.c: remove last remnant of GIT_TEST_GETTEXT_POISON
2021-04-13 15:28:53 -07:00
a9414b86ac Merge branch 'gk/gitweb-redacted-email'
"gitweb" learned "e-mail privacy" feature to redact strings that
look like e-mail addresses on various pages.

* gk/gitweb-redacted-email:
  gitweb: add "e-mail privacy" feature to redact e-mail addresses
2021-04-13 15:28:52 -07:00
8446b388b1 Merge branch 'cc/test-helper-bloom-usage-fix'
Usage message fix for a test helper.

* cc/test-helper-bloom-usage-fix:
  test-bloom: fix missing 'bloom' from usage string
2021-04-13 15:28:52 -07:00
2279289e95 Merge branch 'ab/send-email-validate-errors'
Clean-up codepaths that implements "git send-email --validate"
option and improves the message from it.

* ab/send-email-validate-errors:
  git-send-email: improve --validate error output
  git-send-email: refactor duplicate $? checks into a function
  git-send-email: test full --validate output
2021-04-13 15:28:51 -07:00
4c6ac2da2c Merge branch 'tb/precompose-prefix-simplify'
Streamline the codepath to fix the UTF-8 encoding issues in the
argv[] and the prefix on macOS.

* tb/precompose-prefix-simplify:
  macOS: precompose startup_info->prefix
  precompose_utf8: make precompose_string_if_needed() public
2021-04-13 15:28:51 -07:00
1d5fbd45c4 Merge branch 'fm/user-manual-use-preface'
Doc update to improve git.info

* fm/user-manual-use-preface:
  user-manual.txt: assign preface an id and a title
2021-04-13 15:28:51 -07:00
0623669fc6 Merge branch 'tb/pack-preferred-tips-to-give-bitmap'
A configuration variable has been added to force tips of certain
refs to be given a reachability bitmap.

* tb/pack-preferred-tips-to-give-bitmap:
  builtin/pack-objects.c: respect 'pack.preferBitmapTips'
  t/helper/test-bitmap.c: initial commit
  pack-bitmap: add 'test_bitmap_commits()' helper
2021-04-13 15:28:50 -07:00
7b55441db1 Merge branch 'ab/perl-do-not-abuse-map'
Perl critique.

* ab/perl-do-not-abuse-map:
  git-send-email: replace "map" in void context with "for"
2021-04-13 15:28:50 -07:00
f63add4aa8 Merge branch 'jk/ref-filter-segfault-fix'
A NULL-dereference bug has been corrected in an error codepath in
"git for-each-ref", "git branch --list" etc.

* jk/ref-filter-segfault-fix:
  ref-filter: fix NULL check for parse object failure
2021-04-13 15:28:50 -07:00
f6d25d7878 api docs: document that BUG() emits a trace2 error event
Correct documentation added in e544221d97 (trace2:
Documentation/technical/api-trace2.txt, 2019-02-22) to state that
calling BUG() also emits an "error" event. See ee4512ed48 (trace2:
create new combined trace facility, 2019-02-22) for the initial
implementation.

The BUG() function did not emit an event then however, that was only
changed later in 0a9dde4a04 (usage: trace2 BUG() invocations,
2021-02-05), that commit changed the code, but didn't update any of
the docs.

Let's also add a cross-reference from api-error-handling.txt.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-13 14:57:13 -07:00
4bf0c6f38f api docs: document BUG() in api-error-handling.txt
When the BUG() function was added in d8193743e0 (usage.c: add BUG()
function, 2017-05-12) these docs added in 1f23cfe0ef (doc: document
error handling functions and conventions, 2014-12-03) were not
updated. Let's do that.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-13 14:56:58 -07:00
c00c7382dd usage.c: don't copy/paste the same comment three times
In ee4512ed48 (trace2: create new combined trace facility,
2019-02-22) we started with two copies of this comment,
0ee10fd129 (usage: add trace2 entry upon warning(), 2020-11-23) added
a third. Let's instead add an earlier comment that applies to all
these mostly-the-same functions.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-13 14:56:28 -07:00
feeb03bce6 tests: remove all uses of test_i18cmp
Finish the removal I started in 1108cea7f8 (tests: remove most uses
of test_i18ncmp, 2021-02-11). At that time the function wasn't removed
due to disruption with in-flight changes, remove the occurrences that
have landed since then.

As of writing this there are no test_i18ncmp uses between "master" and
"seen", so let's also remove the function to finally put it to rest.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-13 14:41:24 -07:00
c1fa951d7e revision: avoid parsing with --exclude-promisor-objects
When --exclude-promisor-objects is given, before traversing any objects
we iterate over all of the objects in any promisor packs, marking them
as UNINTERESTING and SEEN. We turn the oid we get from iterating the
pack into an object with parse_object(), but this has two problems:

  - it's slow; we are zlib inflating (and reconstructing from deltas)
    every byte of every object in the packfile

  - it leaves the tree buffers attached to their structs, which means
    our heap usage will grow to store every uncompressed tree
    simultaneously. This can be gigabytes.

We can obviously fix the second by freeing the tree buffers after we've
parsed them. But we can observe that the function doesn't look at the
object contents at all! The only reason we call parse_object() is that
we need a "struct object" on which to set the flags. There are two
options here:

  - we can look up just the object type via oid_object_info(), and then
    call the appropriate lookup_foo() function

  - we can call lookup_unknown_object(), which gives us an OBJ_NONE
    struct (which will get auto-converted later by object_as_type() via
    calls to lookup_commit(), etc).

The first one is closer to the current code, but we do pay the price to
look up the type for each object. The latter should be more efficient in
CPU, though it wastes a little bit of memory (the "unknown" object
structs are a union of all object types, so some of the structs are
bigger than they need to be). It also runs the risk of triggering a
latent bug in code that calls lookup_object() directly but isn't ready
to handle OBJ_NONE (such code would already be buggy, but we use
lookup_unknown_object() infrequently enough that it might be hiding).

I went with the second option here. I don't think the risk is high (and
we'd want to find and fix any such bugs anyway), and it should be more
efficient overall.

The new tests in p5600 show off the improvement (this is on git.git):

  Test                                 HEAD^               HEAD
  -------------------------------------------------------------------------------
  5600.5: count commits                0.37(0.37+0.00)     0.38(0.38+0.00) +2.7%
  5600.6: count non-promisor commits   11.74(11.37+0.37)   0.04(0.03+0.00) -99.7%

The improvement is particularly big in this script because _every_
object in the newly-cloned partial repo is a promisor object. So after
marking them all, there's nothing left to traverse.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-13 13:22:37 -07:00
45a187cc34 lookup_unknown_object(): take a repository argument
All of the other lookup_foo() functions take a repository argument, but
lookup_unknown_object() was never converted, and it uses the_repository
internally. Let's fix that.

We could leave a wrapper that uses the_repository, but there aren't that
many calls, so we'll just convert them all. I looked briefly at each
site to see if we had a repository struct (besides the_repository) we
could pass, but none of them do (so this conversion to pass
the_repository is a pure noop in each case, though it does take us one
step closer to eventually getting rid of the_repository).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-13 13:18:46 -07:00
fcc07e980b is_promisor_object(): free tree buffer after parsing
To get the list of all promisor objects, we not only include all objects
in promisor packs, but also parse each of those objects to see which
objects they reference. After parsing a tree object, the tree->buffer
field will remain populated until we explicitly free it. So in a partial
clone of blob:none, for example, we are essentially reading every tree
in the repository (since they're all in the initial promisor pack), and
keeping all of their uncompressed contents in memory at once.

This patch frees the tree buffers after we've finished marking all of
their reachable objects. We shouldn't need to do this for any other
object type. While we are using some extra memory to store the structs,
no other object type stores the whole contents in its parsed form (we do
sometimes hold on to commit buffers, but less so these days due to
commit graphs, plus most commands which care about promisor objects turn
off the save_commit_buffer global).

Even for a moderate-sized repository like git.git, this patch drops the
peak heap (as measured by massif) for git-fsck from ~1.7GB to ~138MB.
Fsck is a good candidate for measuring here because it doesn't interact
with the promisor code except to call is_promisor_object(), so we can
isolate just this problem.

The added perf test shows only a tiny improvement on my machine for
git.git, since 1.7GB isn't enough to cause any real memory pressure:

  Test                                 HEAD^               HEAD
  --------------------------------------------------------------------------------
  5600.4: fsck                         21.26(20.90+0.35)   20.84(20.79+0.04) -2.0%

With linux.git the absolute change is a bit bigger, though still a small
percentage:

  Test                          HEAD^                 HEAD
  -----------------------------------------------------------------------------
  5600.4: fsck                  262.26(259.13+3.12)   254.92(254.62+0.29) -2.8%

I didn't have the patience to run it under massif with linux.git, but
it's probably on the order of about 14GB improvement, since that's the
sum of the sizes of all of the uncompressed trees (but still isn't
enough to create memory pressure on this particular machine, which has
64GB of RAM). Smaller machines would probably see a bigger effect on
runtime (and sadly our perf suite does not measure peak heap).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-13 13:16:39 -07:00
2a2112a429 refs: print errno for read_raw_ref if GIT_TRACE_REFS is set
The ref backend API uses errno as a sideband error channel.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-12 14:42:37 -07:00
61a7660516 reftable: document an alternate cleanup method on Windows
The new method uses the update_index counter, which isn't susceptible to clock
inaccuracies.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-12 14:29:44 -07:00
4f4d2017a3 svn tests: refactor away a "set -e" in test body
Refactor a test added in 83c9433e67 (git-svn: support for git-svn
propset, 2014-12-07) to avoid using "set -e" in the test body. Let's
move this into a setup test using "test_expect_success" instead.

While I'm at it refactor:

 * Repeated "mkdir" to "mkdir -p"
 * Uses of "touch" to creating the files with ">" instead
 * The "rm -rf" at the end to happen in a "test_when_finished"

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-12 14:10:51 -07:00
88fce1219e svn tests: remove legacy re-setup from init-clone test
Remove the immediate "rm -rf .git" from the start of this test. This
was added back in 41337e22f0 (git-svn: add tests for command-line
usage of init and clone commands, 2007-11-17) when there was a "trash"
directory shared by all the tests, but ever since abc5d372ec (Enable
parallel tests, 2008-08-08) we've had per-test trash directories.

So this setup can simply be removed. We could use
TEST_NO_CREATE_REPO=true, but I don't think it's worth the effort to
go out of our way to be different. It doesn't matter that we now have
a redundant .git at the top-level.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-12 14:10:50 -07:00
8e118e8490 pack-objects: update "nr_seen" progress based on pack-reused count
When serving a clone or fetch with bitmaps, after deciding which objects
need to be sent our "pack reuse" mechanism kicks in: we try to send
more-or-less verbatim a bunch of objects from the beginning of the
bitmapped packfile without even adding them to the to_pack.objects
array.

After deciding which objects will be in the "reused" portion, we update
nr_result to account for those, and then trigger display_progress() to
show the user (who is undoubtedly dazzled that we managed to enumerate
so many objects so quickly).

But then something confusing happens: the "Enumerating objects" progress
meter jumps _backwards_, counting up from zero the number of objects we
actually add into to_pack.objects.

This worked correctly once upon a time, but was broken in 5af050437a
(pack-objects: show some progress when counting kept objects,
2018-04-15), when the latter half of that progress meter switched to
using a separate nr_seen counter, rather than nr_result. Nobody noticed
for two reasons:

  - prior to the pack-reuse fixes from a14aebeac3 (Merge branch
    'jk/packfile-reuse-cleanup', 2020-02-14), the reuse code almost
    never kicked in anyway

  - the output looks _kind of_ correct. The "backwards" moment is hard
    to catch, because we overwrite the old progress number with the new
    one, and the larger number is displayed only for a second. So unless
    you look at that exact second, you just see the much smaller value,
    counting up to the number of non-reused objects (though of course if
    you catch it in stderr, or look at GIT_TRACE_PACKET from a server
    with bitmaps, you can see both values).

This smaller output isn't wrong per se, but isn't counting what we ever
intended to. We should give the user the whole number of objects we
considered (which, as per 5af050437a's original purpose, is already
_not_ a count of what goes into to_pack.objects). The follow-on
"Counting objects" meter shows the actual number of objects we feed into
that array.

We can easily fix this by bumping (and showing) nr_seen for the
pack-reused objects. When the included test is run without this patch,
the second pack-objects invocation produces "Enumerating objects: 1" to
show the one loose object, even though the resulting pack has hundreds
of objects in it. With it, we jump to "Enumerating objects: 674" after
deciding on reuse, and then "675" when we add in the loose object.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-12 11:31:30 -07:00
c1ea48a8f7 merge-ort: only do pointer arithmetic for non-empty lists
versions could be an empty string_list. In that case, versions->items is
NULL, and we shouldn't be trying to perform pointer arithmetic with it (as
that results in undefined behaviour).

Moreover we only use the results of this calculation once when calling
QSORT. Therefore we choose to skip creating relevant_entries and call
QSORT directly with our manipulated pointers (but only if there's data
requiring sorting). This lets us avoid abusing the string_list API,
and saves us from having to explain why this abuse is OK.

Finally, an assertion is added to make sure that write_tree() is called
with a valid offset.

This issue has probably existed since:
  ee4012dcf9 (merge-ort: step 2 of tree writing -- function to create tree object, 2020-12-13)
But it only started occurring during tests since tests started using
merge-ort:
  f3b964a07e (Add testing with merge-ort merge strategy, 2021-03-20)

For reference - here's the original UBSAN commit that implemented this
check, it sounds like this behaviour isn't actually likely to cause any
issues (but we might as well fix it regardless):
https://reviews.llvm.org/D67122

UBSAN output from t3404 or t5601:

merge-ort.c:2669:43: runtime error: applying zero offset to null pointer
    #0 0x78bb53 in write_tree merge-ort.c:2669:43
    #1 0x7856c9 in process_entries merge-ort.c:3303:2
    #2 0x782317 in merge_ort_nonrecursive_internal merge-ort.c:3744:2
    #3 0x77feef in merge_incore_nonrecursive merge-ort.c:3853:2
    #4 0x6f6a5c in do_recursive_merge sequencer.c:640:3
    #5 0x6f6a5c in do_pick_commit sequencer.c:2221:9
    #6 0x6ef055 in single_pick sequencer.c:4814:9
    #7 0x6ef055 in sequencer_pick_revisions sequencer.c:4867:10
    #8 0x4fb392 in run_sequencer revert.c:225:9
    #9 0x4fa5b0 in cmd_revert revert.c:235:8
    #10 0x42abd7 in run_builtin git.c:453:11
    #11 0x429531 in handle_builtin git.c:704:3
    #12 0x4282fb in run_argv git.c:771:4
    #13 0x4282fb in cmd_main git.c:902:19
    #14 0x524b63 in main common-main.c:52:11
    #15 0x7fc2ca340349 in __libc_start_main (/lib64/libc.so.6+0x24349)
    #16 0x4072b9 in _start start.S:120

SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior merge-ort.c:2669:43 in

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-12 10:38:10 -07:00
9a2a4f9544 list-objects: support filtering by tag and commit
Object filters currently only support filtering blobs or trees based on
some criteria. This commit lays the foundation to also allow filtering
of tags and commits.

No change in behaviour is expected from this commit given that there are
no filters yet for those object types.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-12 09:35:50 -07:00
414abf159f docs: fix linting issues due to incorrect relative section order
Re-order the sections of a few manual pages to be consistent with the
entirety of the rest of our documentation. This allows us to remove
the just-added whitelist of "bad" order from
lint-man-section-order.perl.

I'm doing that this way around so that code will be easy to dig up if
we'll need it in the future. I've intentionally not added some other
sections such as EXAMPLES to the list of known sections.

If we were to add that we'd find some out of order. Perhaps we'll want
to order those consistently as well in the future, at which point
whitelisting some of them might become handy again.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-10 23:36:34 -07:00
ea8b9271b1 doc lint: lint relative section order
Add a linting script to check the relative order of the sections in
the documentation. We should have NAME, then SYNOPSIS, DESCRIPTION,
OPTIONS etc. in that order.

That holds true throughout our documentation, except for a few
exceptions which are hardcoded in the linting script.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-10 23:36:34 -07:00
cafd9828e8 doc lint: lint and fix missing "GIT" end sections
Lint for and fix the three manual pages that were missing the standard
"Part of the linkgit:git[1] suite" end section.

We only do this for the man[157] section documents (we don't have
anything outside those sections), not files to be included,
howto *.txt files etc.

We could also add this to the existing (and then renamed)
lint-gitlink.perl, but I'm not doing that here.

Obviously all of that fits in one script, but I think for something
like this that's a one-off script with global variables it's much
harder to follow when a large part of your script is some if/else or
keeping/resetting of state simply to work around the script doing two
things instead of one.

Especially because in this case this script wants to process the file
as one big string, but lint-gitlink.perl wants to look at it one line
at a time. We could also consolidate this whole thing and
t/check-non-portable-shell.pl, but that one likes to join lines as
part of its shell parsing.

So let's just add another script, whole scaffolding is basically:

    use strict;
    use warnings;
    sub report { ... }
    my $code = 0;
    while (<>) { ... }
    exit $code;

We'd spend more lines effort trying to consolidate them than just
copying that around.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-10 23:36:34 -07:00
d2c9908076 doc lint: fix bugs in, simplify and improve lint script
The lint-gitlink.perl script added in ab81411ced (ci: validate
"linkgit:" in documentation, 2016-05-04) was more complex than it
needed to be. It:

 - Was using File::Find to recursively find *.txt files in
   Documentation/, let's instead use the Makefile as a source of truth
   for *.txt files, and pass it down to the script.

 - We now don't lint linkgit:* in RelNotes/* or technical/*, which we
   shouldn't have been doing in the first place anyway.

 - When the doc-diff script was added in beb188e22a (add a script to
   diff rendered documentation, 2018-08-06) we started sometimes having
   a "git worktree" under Documentation/.

   This tree contains a full checkout of git.git, as a result the
   "lint" script would recurse into that, and lint any *.txt file
   found in that entire repository.

   In practice the only in-tree "linkgit" outside of the
   Documentation/ tree is contrib/contacts/git-contacts.txt and
   contrib/subtree/git-subtree.txt, so this wouldn't emit any errors

Now we instead simply trust the Makefile to give us *.txt files.
Since the Makefile also knows what sections each page should be in we
don't have to open the files ourselves and try to parse that out. As a
bonus this will also catch bugs with the section line in the files
themselves being incorrect.

The structure of the new script is mostly based on
t/check-non-portable-shell.pl. As an added bonus it will also use
pos() to print where the problems it finds are, e.g. given an issue
like:

    diff --git a/Documentation/git-cherry.txt b/Documentation/git-cherry.txt
    [...]
     and line numbers.  git-cherry therefore detects when commits have been
    -"copied" by means of linkgit:git-cherry-pick[1], linkgit:git-am[1] or
    -linkgit:git-rebase[1].
    +"copied" by means of linkgit:git-cherry-pick[2], linkgit:git-am[3] or
    +linkgit:git-rebase[4].

We'll now emit:

    git-cherry.txt:20: error: git-cherry-pick[2]: wrong section (should be 1), shown with 'HERE' below:
    git-cherry.txt:20:      '"copied" by means of linkgit:git-cherry-pick[2]' <-- HERE
    git-cherry.txt:20: error: git-am[3]: wrong section (should be 1), shown with 'HERE' below:
    git-cherry.txt:20:      '"copied" by means of linkgit:git-cherry-pick[2], linkgit:git-am[3]' <-- HERE
    git-cherry.txt:21: error: git-rebase[4]: wrong section (should be 1), shown with 'HERE' below:
    git-cherry.txt:21:      'linkgit:git-rebase[4]' <-- HERE

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-10 23:36:34 -07:00
3951eeb6d9 doc lint: Perl "strict" and "warnings" in lint-gitlink.perl
Amend this script added in ab81411ced (ci: validate "linkgit:" in
documentation, 2016-05-04) to pass under "use strict", and add a "use
warnings" for good measure.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-10 23:36:34 -07:00
19bcc73e70 Documentation/Makefile: make doc.dep dependencies a variable again
Re-introduce a variable to declare what *.txt files need to be
considered for the purposes of scouring files to generate a dependency
graph of includes.

When doc.dep was introduced in a5ae8e64cf (Fix documentation
dependency generation., 2005-11-07) we had such a variable called
TEXTFILES, but it was refactored away just a few commits after that in
fb612d54c1 (Documentation: fix dependency generation.,
2005-11-07). I'm planning to add more wildcards here, so let's bring
it back.

I'm not calling it TEXTFILES because we e.g. don't consider
Documentation/technical/*.txt when generating the graph (they don't
use includes). Let's instead call it DOC_DEP_TXT.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-10 23:36:34 -07:00
824c621b76 Documentation/Makefile: make $(wildcard howto/*.txt) a var
Refactor occurrences of $(wildcard howto/*.txt) into a single
HOWTO_TXT variable for readability and consistency.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-10 23:36:34 -07:00
e5b32bffd1 rebase: don't override --no-reschedule-failed-exec with config
Fix a bug in how --no-reschedule-failed-exec interacts with
rebase.rescheduleFailedExec=true being set in the config. Before this
change the --no-reschedule-failed-exec config option would be
overridden by the config.

This bug happened because of the particulars of how "rebase" works
v.s. most other git commands when it comes to parsing options and
config:

When we read the config and parse the CLI options we correctly prefer
the --no-reschedule-failed-exec option over
rebase.rescheduleFailedExec=true in the config. So far so good.

However the --reschedule-failed-exec option doesn't take effect when
the rebase starts (we'd just create a
".git/rebase-merge/reschedule-failed-exec" file if it was true). It
only takes effect when the exec command fails, at which point we'll
reschedule the failed "exec" command.

Since we only wrote out the positive
".git/rebase-merge/reschedule-failed-exec" under
--reschedule-failed-exec, but nothing with --no-reschedule-failed-exec
we'll forget that we asked not to reschedule failed "exec", and would
happily re-read the config and see that
rebase.rescheduleFailedExec=true is set.

So the config will effectively override the user having explicitly
disabled the option on the command-line.

Even more confusingly: Since rebase accepts different options based on
its state there wasn't even a way to get around this with "rebase
--continue --no-reschedule-failed-exec" (but you could of course set
the config with "rebase -c ...").

I think the least bad way out of this is to declare that for such
options and config whatever we decide at the beginning of the rebase
goes. So we'll now always create either a "reschedule-failed-exec" or
a "no-reschedule-failed-exec file at the start, not just the former if
we decided we wanted the feature.

With this new worldview you can no longer change the setting once a
rebase has started except by manually removing the state files
discussed above. I think making it work like that is the the least
confusing thing we can do.

In the future we might want to learn to change the setting in the
middle by combining "--edit-todo" with
"--[no-]reschedule-failed-exec", we currently don't support combining
those options, or any other way to change the state in the middle of
the rebase short of manually editing the files in
".git/rebase-merge/*".

The bug being fixed here originally came about because of a
combination of the behavior of the code added in d421afa0c6 (rebase:
introduce --reschedule-failed-exec, 2018-12-10) and the addition of
the config variable in 969de3ff0e (rebase: add a config option to
default to --reschedule-failed-exec, 2018-12-10).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-10 23:23:49 -07:00
cd663df710 rebase tests: camel-case rebase.rescheduleFailedExec consistently
Fix a test added in 906b63942a (rebase --am: ignore
rebase.rescheduleFailedExec, 2019-07-01) to camel-case the
configuration variable. This doesn't change the behavior of the test,
it's merely to help its human readers.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-10 23:23:48 -07:00
628d81be6c list-objects: move tag processing into its own function
Move processing of tags into its own function to make the logic easier
to extend when we're going to implement filtering for tags. No change in
behaviour is expected from this commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-10 23:03:20 -07:00
b2025da38b revision: mark commit parents as NOT_USER_GIVEN
The NOT_USER_GIVEN flag of an object marks whether a flag was explicitly
provided by the user or not. The most important use case for this is
when filtering objects: only objects that were not explicitly requested
will get filtered.

The flag is currently only set for blobs and trees, which has been fine
given that there are no filters for tags or commits currently. We're
about to extend filtering capabilities to add object type filter though,
which requires us to set up the NOT_USER_GIVEN flag correctly -- if it's
not set, the object wouldn't get filtered at all.

Mark unseen commit parents as NOT_USER_GIVEN when processing parents.
Like this, explicitly provided parents stay user-given and thus
unfiltered, while parents which get loaded as part of the graph walk
can be filtered.

This commit shouldn't have any user-visible impact yet as there is no
logic to filter commits yet.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-10 23:03:20 -07:00
a812789c26 uploadpack.txt: document implication of uploadpackfilter.allow
When `uploadpackfilter.allow` is set to `true`, it means that filters
are enabled by default except in the case where a filter is explicitly
disabled via `uploadpackilter.<filter>.allow`. This option will not only
enable the currently supported set of filters, but also any filters
which get added in the future. As such, an admin which wants to have
tight control over which filters are allowed and which aren't probably
shouldn't ever set `uploadpackfilter.allow=true`.

Amend the documentation to make the ramifications more explicit so that
admins are aware of this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-10 23:03:19 -07:00
6871d0cec6 fetch-pack: refactor command and capability write
A subsequent commit will need this functionality independent of the rest
of send_fetch_request(), so put this into its own function.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 21:50:22 -07:00
57c3451b2e fetch-pack: refactor add_haves()
A subsequent commit will need part, but not all, of the functionality in
add_haves(), so move some of its functionality to its sole caller
send_fetch_request().

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 21:50:21 -07:00
8102570374 fetch-pack: refactor process_acks()
A subsequent commit will need part, but not all, of the functionality in
process_acks(), so move some of its functionality to its sole caller
do_fetch_pack_v2(). As a side effect, the resulting code is also
shorter.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 21:50:21 -07:00
6db01a7308 Merge branch 'jt/fetch-pack-request-fix' into jt/push-negotiation
* jt/fetch-pack-request-fix:
  fetch-pack: buffer object-format with other args
2021-04-08 21:50:10 -07:00
81ed96a9b2 fetch-pack: buffer object-format with other args
In send_fetch_request(), "object-format" is written directly to the file
descriptor, as opposed to the other arguments, which are buffered.
Buffer "object-format" as well. "object-format" must be buffered; in
particular, it must appear after "command=fetch" in the request.

This divergence was introduced in 4b831208bb ("fetch-pack: parse and
advertise the object-format capability", 2020-05-27), perhaps as an
oversight (the surrounding code at the point of this commit has already
been using a request buffer.)

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 21:49:47 -07:00
0996dd3d6d gitweb: add "e-mail privacy" feature to redact e-mail addresses
Gitweb extracts content from the Git log and makes it accessible
over HTTP. As a result, e-mail addresses found in commits are
exposed to web crawlers and they may not respect robots.txt.
This can result in unsolicited messages.

Introduce an 'email-privacy' feature which redacts e-mail addresses
from the generated HTML content. Specifically, obscure addresses
retrieved from the the author/committer and comment sections of the
Git log. The feature is off by default.

This feature does not prevent someone from downloading the
unredacted commit log, e.g., by cloning the repository, and
extracting information from it. It aims to hinder the low-
effort, bulk collection of e-mail addresses by web crawlers.

Signed-off-by: Georgios Kontaxis <geko1702+commits@99rst.org>
Acked-by: Eric Wong <e@80x24.org>
Acked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 15:54:26 -07:00
56550ea718 Makefile: add missing dependencies of 'config-list.h'
We auto-generate the list of supported configuration variables from
'Documentation/config/*.txt', and that list used to be created by the
'generate-cmdlist.sh' helper script and stored in the 'command-list.h'
header.  Commit 709df95b78 (help: move list_config_help to
builtin/help, 2020-04-16) extracted this into a dedicated
'generate-configlist.sh' script and 'config-list.h' header, and added
a new target in the 'Makefile' as well, but while doing so it forgot
to extract the dependencies of the latter.  Consequently, since then
'config-list.h' is not re-generated when 'Documentation/config/*.txt'
is updated, while 'command-list.h' is re-generated unnecessarily:

  $ touch Documentation/config/log.txt
  $ make -j4
      GEN command-list.h
      CC help.o
      AR libgit.a

Fix this and list all config-related documentation files as
dependencies of 'config-list.h' and remove them from the dependencies
of 'command-list.h'.

  $ touch Documentation/config/log.txt
  $ make
      GEN config-list.h
      CC builtin/help.o
      LINK git

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 15:04:58 -07:00
d5f4b8260f rm: honor sparse checkout patterns
`git add` refrains from adding or updating index entries that are
outside the current sparse checkout, but `git rm` doesn't follow the
same restriction. This is somewhat counter-intuitive and inconsistent.
So make `rm` honor the sparsity rules and advise on how to remove
SKIP_WORKTREE entries just like `add` does. Also add some tests for the
new behavior.

Suggested-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 14:18:03 -07:00
a20f70478f add: warn when asked to update SKIP_WORKTREE entries
`git add` already refrains from updating SKIP_WORKTREE entries, but it
silently exits with zero code when it is asked to do so. Instead, let's
warn the user and display a hint on how to update these entries.

Note that we only warn the user whey they give a pathspec item that
matches no eligible path for updating, but it does match one or more
SKIP_WORKTREE entries. A warning was chosen over erroring out right away
to reproduce the same behavior `add` already exhibits with ignored
files. This also allow users to continue their workflow without having
to invoke `add` again with only the eligible paths (as those will have
already been added).

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 14:18:03 -07:00
b243012cb3 refresh_index(): add flag to ignore SKIP_WORKTREE entries
refresh_index() doesn't update SKIP_WORKTREE entries, but it still
matches them against the given pathspecs, marks the matches on the
seen[] array, check if unmerged, etc. In the following patch, one caller
will need refresh_index() to ignore SKIP_WORKTREE entries entirely, so
add a flag that implements this behavior.

While we are here, also realign the REFRESH_* flags and convert the hex
values to the more natural bit shift format, which makes it easier to
spot holes.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 14:18:03 -07:00
719630eb48 pathspec: allow to ignore SKIP_WORKTREE entries on index matching
Add a new enum parameter to `add_pathspec_matches_against_index()` and
`find_pathspecs_matching_against_index()`, allowing callers to specify
whether these function should attempt to match SKIP_WORKTREE entries or
not. This will be used in a future patch to make `git add` display a
warning when it is asked to update SKIP_WORKTREE entries.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 14:18:03 -07:00
d73dbafc2c add: make --chmod and --renormalize honor sparse checkouts
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 14:18:03 -07:00
6594afc3cc t3705: add tests for git add in sparse checkouts
We already have a couple tests for `add` with SKIP_WORKTREE entries in
t7012, but these only cover the most basic scenarios. As we will be
changing how `add` deals with sparse paths in the subsequent commits,
let's move these two tests to their own file and add more test cases
for different `add` options and situations. This also demonstrates two
options that don't currently respect SKIP_WORKTREE entries: `--chmod`
and `--renormalize`.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 14:18:03 -07:00
4e95698349 add: include magic part of pathspec on --refresh error
When `git add --refresh <pathspec>` doesn't find any matches for the
given pathspec, it prints an error message using the `match` field of
the `struct pathspec_item`. However, this field doesn't contain the
magic part of the pathspec. Instead, let's use the `original` field.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 14:18:03 -07:00
a437390310 userdiff: add support for Scheme
Add a diff driver for Scheme-like languages which recognizes top level
and local `define` forms, whether it is a function definition, binding,
syntax definition or a user-defined `define-xyzzy` form.

Also supports R6RS `library` forms, `module` forms along with class and
struct declarations used in Racket (PLT Scheme).

Alternate "def" syntax such as those in Gerbil Scheme are also
supported, like defstruct, defsyntax and so on.

The rationale for picking `define` forms for the hunk headers is because
it is usually the only significant form for defining the structure of
the program, and it is a common pattern for schemers to have local
function definitions to hide their visibility, so it is not only the top
level `define`'s that are of interest. Schemers also extend the language
with macros to provide their own define forms (for example, something
like a `define-test-suite`) which is also captured in the hunk header.

Since it is common practice to extend syntax with variants of a form
like `module+`, `class*` etc, those have been supported as well.

The word regex is a best-effort attempt to conform to R7RS[1] valid
identifiers, symbols and numbers.

[1] https://small.r7rs.org/attachment/r7rs.pdf (section 2.1)

Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 13:56:09 -07:00
89b43f80a5 The eighth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 13:23:26 -07:00
14cc08de23 Merge branch 'ab/make-tags-quiet'
Generate [ec]tags under $(QUIET_GEN).

* ab/make-tags-quiet:
  Makefile: add QUIET_GEN to "tags" and "TAGS" targets
2021-04-08 13:23:26 -07:00
bde35a2a93 Merge branch 'rs/daemon-sanitize-dir-sep'
"git daemon" has been tightened against systems that take backslash
as directory separator.

* rs/daemon-sanitize-dir-sep:
  daemon: sanitize all directory separators
2021-04-08 13:23:26 -07:00
1b31224e59 Merge branch 'en/ort-perf-batch-9'
The ort merge backend has been optimized by skipping irrelevant
renames.

* en/ort-perf-batch-9:
  diffcore-rename: avoid doing basename comparisons for irrelevant sources
  merge-ort: skip rename detection entirely if possible
  merge-ort: use relevant_sources to filter possible rename sources
  merge-ort: precompute whether directory rename detection is needed
  merge-ort: introduce wrappers for alternate tree traversal
  merge-ort: add data structures for an alternate tree traversal
  merge-ort: precompute subset of sources for which we need rename detection
  diffcore-rename: enable filtering possible rename sources
2021-04-08 13:23:26 -07:00
82fd285e46 Merge branch 'en/sequencer-edit-upon-conflict-fix'
"git cherry-pick/revert" with or without "--[no-]edit" did not spawn
the editor as expected (e.g. "revert --no-edit" after a conflict
still asked to edit the message), which has been corrected.

* en/sequencer-edit-upon-conflict-fix:
  sequencer: fix edit handling for cherry-pick and revert messages
2021-04-08 13:23:26 -07:00
e6b971fcf5 Merge branch 'tb/reverse-midx'
An on-disk reverse-index to map the in-pack location of an object
back to its object name across multiple packfiles is introduced.

* tb/reverse-midx:
  midx.c: improve cache locality in midx_pack_order_cmp()
  pack-revindex: write multi-pack reverse indexes
  pack-write.c: extract 'write_rev_file_order'
  pack-revindex: read multi-pack reverse indexes
  Documentation/technical: describe multi-pack reverse indexes
  midx: make some functions non-static
  midx: keep track of the checksum
  midx: don't free midx_name early
  midx: allow marking a pack as preferred
  t/helper/test-read-midx.c: add '--show-objects'
  builtin/multi-pack-index.c: display usage on unrecognized command
  builtin/multi-pack-index.c: don't enter bogus cmd_mode
  builtin/multi-pack-index.c: split sub-commands
  builtin/multi-pack-index.c: define common usage with a macro
  builtin/multi-pack-index.c: don't handle 'progress' separately
  builtin/multi-pack-index.c: inline 'flags' with options
2021-04-08 13:23:25 -07:00
22eee7f455 Merge branch 'll/clone-reject-shallow'
"git clone --reject-shallow" option fails the clone as soon as we
notice that we are cloning from a shallow repository.

* ll/clone-reject-shallow:
  builtin/clone.c: add --reject-shallow option
2021-04-08 13:23:25 -07:00
f08b4013c3 blame tests: simplify userdiff driver test
Simplify the test added in 9466e3809d (blame: enable funcname blaming
with userdiff driver, 2020-11-01) to use the --author support recently
added in 999cfc4f45 (test-lib functions: add --author support to
test_commit, 2021-01-12).

We also did not need the full fortran-external-function content. Let's
cut it down to just the important parts.

I'm modifying it to demonstrate that the fortran-specific userdiff
function is in effect by adding "DO NOT MATCH ..." and "AS THE ..."
lines surrounding the "RIGHT" one.

This is to check that we're using the userdiff "fortran" driver, as
opposed to the default driver which would match on those lines as part
of the general heuristic of matching a line that doesn't begin with
whitespace.

The test had also been leaving behind a .gitattributes file for later
tests to possibly trip over, let's clean it up with
"test_when_finished".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 12:19:10 -07:00
b269441be2 blame tests: don't rely on t/t4018/ directory
Refactor a test added in 9466e3809d (blame: enable funcname blaming
with userdiff driver, 2020-11-01) so that the blame tests don't rely
on stealing the contents of "t/t4018/fortran-external-function".

I have another patch series that'll possibly (or not) refactor that
file, but having this test inter-dependency makes things simple in any
case by making this test more readable.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 12:19:10 -07:00
6cb77966ec userdiff: remove support for "broken" tests
There have been no "broken" tests since 75c3b6b2e8 (userdiff: improve
Fortran xfuncname regex, 2020-08-12). Let's remove the test support
for them.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 12:19:10 -07:00
28e8f0d5e5 userdiff tests: list builtin drivers via test-tool
Change the userdiff test to list the builtin drivers via the
test-tool, using the new for_each_userdiff_driver() API function.

This gets rid of the need to modify this part of the test every time a
new pattern is added, see 2ff6c34612 (userdiff: support Bash,
2020-10-22) and 09dad9256a (userdiff: support Markdown, 2020-05-02)
for two recent examples.

I only need the "list-builtin-drivers "argument here, but let's add
"list-custom-drivers" and "list-drivers" too, just because it's easy.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 12:19:10 -07:00
132bf25989 userdiff tests: explicitly test "default" pattern
Since 122aa6f9c0 (diff: introduce diff.<driver>.binary, 2008-10-05)
the internals of the userdiff.c code have understood a "default" name,
which is invoked as userdiff_find_by_name("default") and present in
the "builtin_drivers" struct. Let's test for this special case.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 12:19:10 -07:00
f12fa9ee6c userdiff: add and use for_each_userdiff_driver()
Refactor the userdiff_find_by_namelen() function so that a new
for_each_userdiff_driver() API function does most of the work.

This will be useful for the same reason we've got other for_each_*()
API functions as part of various APIs, and will be used in a follow-up
commit.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 12:19:10 -07:00
82512e008c userdiff style: normalize pascal regex declaration
Declare the pascal pattern consistently with how we declare the
others, not having "\n" on one line by itself, but as part of the
pattern, and when there are alterations have the "|" at the start, not
end of the line.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 12:19:09 -07:00
6d1c9c527e userdiff style: declare patterns with consistent style
Change those patterns which were declared with a regex on the same
line as the "PATTERNS()" line to put that regex on the next line, and
add missing "/* -- */" separator comments between the pattern and
word_regex.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 12:19:09 -07:00
ddd164d026 userdiff style: re-order drivers in alphabetical order
Address some old code smell and move around the built-in userdiff
drivers so they're both in alphabetical order, and now in the same
order they appear in the gitattributes(5) documentation.

The two started drifting in be58e70dba (diff: unify external diff and
funcname parsing code, 2008-10-05), and then even further in
80c49c3de2 (color-words: make regex configurable via attributes,
2009-01-17) when the "cpp" pattern was added.

There are no functional changes here, and as --color-moved will show
only moved existing lines.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 12:19:09 -07:00
39e12650d7 config.c: remove last remnant of GIT_TEST_GETTEXT_POISON
Remove a use of GIT_TEST_GETTEXT_POISON added in f276e2a469 (config:
improve error message for boolean config, 2021-02-11).

This was simultaneously in-flight with my d162b25f95 (tests: remove
support for GIT_TEST_GETTEXT_POISON, 2021-01-20) which removed the
rest of the GIT_TEST_GETTEXT_POISON code.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 10:54:08 -07:00
c5c0548d79 completion: audit and guard $GIT_* against unset use
$GIT_COMPLETION_SHOW_ALL and $GIT_TESTING_ALL_COMMAND_LIST were used
without guarding against them being unset, causing errors in nounset
(set -u) mode.

No other nounset-unsafe $GIT_* usages were found.

While at it, remove a superfluous (duplicate) unset guard from $GIT_DIR
in __git_find_repo_path.

Signed-off-by: Ville Skyttä <ville.skytta@iki.fi>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 10:45:36 -07:00
c0c2a37ac2 git-apply: allow simultaneous --cached and --3way options
"git apply" does not allow "--cached" and "--3way" to be used
together, since "--3way" writes conflict markers into the working
tree.

Allow "git apply" to accept "--cached" and "--3way" at the same
time.  When a single file auto-resolves cleanly, the result is
placed in the index at stage #0 and the command exits with 0 status.

For a file that has a conflict which cannot be cleanly
auto-resolved, the original contents from common ancestor (stage
conflict at the content level, and the command exists with non-zero
status, because there is no place (like the working tree) to leave a
half-resolved merge for the user to resolve.

The user can use `git diff` to view the contents of the conflict, or
`git checkout -m -- .` to regenerate the conflict markers in the
working directory.

Don't attempt rerere in this case since it depends on conflict
markers written to file for its database storage and lookup. There
would be two main changes required to get rerere working:

1. Allow the rerere api to accept in memory object rather than
   files, which would allow us to pass in the conflict markers
   contained in the result from ll_merge().

2. Rerere can't write to the working directory, so it would have to
   apply the result to cache stage #0 directly. A flag would be
   needed to control this.

Signed-off-by: Jerry Zhang <jerry@skydio.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-07 22:20:33 -07:00
a0dda6023e The seventh batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-07 16:54:09 -07:00
5644419d04 Merge branch 'ab/fsck-api-cleanup'
Fsck API clean-up.

* ab/fsck-api-cleanup:
  fetch-pack: use new fsck API to printing dangling submodules
  fetch-pack: use file-scope static struct for fsck_options
  fetch-pack: don't needlessly copy fsck_options
  fsck.c: move gitmodules_{found,done} into fsck_options
  fsck.c: add an fsck_set_msg_type() API that takes enums
  fsck.c: pass along the fsck_msg_id in the fsck_error callback
  fsck.[ch]: move FOREACH_FSCK_MSG_ID & fsck_msg_id from *.c to *.h
  fsck.c: give "FOREACH_MSG_ID" a more specific name
  fsck.c: undefine temporary STR macro after use
  fsck.c: call parse_msg_type() early in fsck_set_msg_type()
  fsck.h: re-order and re-assign "enum fsck_msg_type"
  fsck.h: move FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} into an enum
  fsck.c: refactor fsck_msg_type() to limit scope of "int msg_type"
  fsck.c: rename remaining fsck_msg_id "id" to "msg_id"
  fsck.c: remove (mostly) redundant append_msg_id() function
  fsck.c: rename variables in fsck_set_msg_type() for less confusion
  fsck.h: use "enum object_type" instead of "int"
  fsck.h: use designed initializers for FSCK_OPTIONS_{DEFAULT,STRICT}
  fsck.c: refactor and rename common config callback
2021-04-07 16:54:09 -07:00
d637a267d8 Merge branch 'cc/downcase-opt-help'
A few option description strings started with capital letters,
which were corrected.

* cc/downcase-opt-help:
  column, range-diff: downcase option description
2021-04-07 16:54:09 -07:00
3cf14f88de Merge branch 'js/security-md'
SECURITY.md that is facing individual contributors and end users
has been introduced.  Also a procedure to follow when preparing
embargoed releases has been spelled out.

* js/security-md:
  Document how we do embargoed releases
  SECURITY: describe how to report vulnerabilities
2021-04-07 16:54:09 -07:00
58840e62a4 Merge branch 'ps/pack-bitmap-optim'
Optimize "rev-list --use-bitmap-index --objects" corner case that
uses negative tags as the stopping points.

* ps/pack-bitmap-optim:
  pack-bitmap: avoid traversal of objects referenced by uninteresting tag
2021-04-07 16:54:09 -07:00
68e15e0c23 Merge branch 'zh/commit-trailer'
"git commit" learned "--trailer <key>[=<value>]" option; together
with the interpret-trailers command, this will make it easier to
support custom trailers.

* zh/commit-trailer:
  commit: add --trailer option
2021-04-07 16:54:08 -07:00
a548f3e0ad Merge branch 'js/cmake-vsbuild'
CMake update for vsbuild.

* js/cmake-vsbuild:
  cmake(install): include vcpkg dlls
  cmake: add a preparatory work-around to accommodate `vcpkg`
  cmake(install): fix double .exe suffixes
  cmake: support SKIP_DASHED_BUILT_INS
2021-04-07 16:54:08 -07:00
573c5e50ab Merge branch 'ds/clarify-hashwrite'
The hashwrite() API uses a buffering mechanism to avoid calling
write(2) too frequently. This logic has been refactored to be
easier to understand.

* ds/clarify-hashwrite:
  csum-file: make hashwrite() more readable
2021-04-07 16:54:08 -07:00
642a40019c Merge branch 'ah/plugleaks'
Plug or annotate remaining leaks that trigger while running the
very basic set of tests.

* ah/plugleaks:
  transport: also free remote_refs in transport_disconnect()
  parse-options: don't leak alias help messages
  parse-options: convert bitfield values to use binary shift
  init-db: silence template_dir leak when converting to absolute path
  init: remove git_init_db_config() while fixing leaks
  worktree: fix leak in dwim_branch()
  clone: free or UNLEAK further pointers when finished
  reset: free instead of leaking unneeded ref
  symbolic-ref: don't leak shortened refname in check_symref()
2021-04-07 16:54:08 -07:00
3994ae510e bash completion: complete CHERRY_PICK_HEAD
When e.g. in a failed cherry pick we did not recognize
CHERRY_PICK_HEAD as we do e.g. REBASE_HEAD in a failed rebase let's
rectify that.

When REBASE_HEAD was added in fbd7a23237 (rebase: introduce and use
pseudo-ref REBASE_HEAD, 2018-02-11) a completion was added for it, but
no corresponding completion existed for CHERRY_PICK_HEAD added in
d7e5c0cbfb (Introduce CHERRY_PICK_HEAD, 2011-02-19).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-07 15:14:51 -07:00
923cd87ac8 git-apply: try threeway first when "--3way" is used
The apply_fragments() method of "git apply"
can silently apply patches incorrectly if
a file has repeating contents. In these
cases a three-way merge is capable of applying
it correctly in more situations, and will
show a conflict rather than applying it
incorrectly. However, because the patches
apply "successfully" using apply_fragments(),
git will never fall back to the merge, even
if the "--3way" flag is used, and the user has
no way to ensure correctness by forcing the
three-way merge method.

Change the behavior so that when "--3way" is used,
git will always try the three-way merge first and
will only fall back to apply_fragments() in cases
where blobs are not available or some other error
(but not in the case of a merge conflict).

Since user-facing results will be different,
this has backwards compatibility implications
for users depending on the old behavior. In
addition, the three-way merge will be slower
than direct patch application.

Signed-off-by: Jerry Zhang <jerry@skydio.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-06 17:11:41 -07:00
a039a1fcf9 maintenance: simplify prefetch logic
The previous logic filled a string list with the names of each remote,
but instead we could simply run the appropriate 'git fetch' data
directly in the remote iterator. Do this for reduced code size, but also
because it sets up an upcoming change to use the remote's refspec. This
data is accessible from the 'struct remote' data that is now accessible
in fetch_remote().

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-06 14:23:47 -07:00
ea7811b37e git-send-email: improve --validate error output
Improve the output we emit on --validate error to:

 * Say "FILE:LINE" instead of "FILE: LINE", to match "grep -n",
   compiler error messages etc.

 * Don't say "patch contains a" after just mentioning the filename,
   just leave it at "FILE:LINE: is longer than[...]. The "contains a"
   sounded like we were talking about the file in general, when we're
   actually checking it line-by-line.

 * Don't just say "rejected by sendemail-validate hook", but combine
   that with the system_or_msg() output to say what exit code the hook
   died with.

I had an aborted attempt to make the line length checker note all
lines that were longer than the limit. I didn't think that was worth
the effort, but I've left in the testing change to check that we die
as soon as we spot the first long line.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-06 12:57:06 -07:00
d21616c039 git-send-email: refactor duplicate $? checks into a function
Refactor the duplicate checking of $? into a function. There's an
outstanding series[1] wanting to add a third use of system() in this
file, let's not copy this boilerplate anymore when that happens.

1. http://lore.kernel.org/git/87y2esg22j.fsf@evledraar.gmail.com

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-06 12:57:06 -07:00
e585210e1b git-send-email: test full --validate output
Change the tests that grep substrings out of the output to use a full
test_cmp, in preparation for improving the output.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-06 12:57:05 -07:00
dba94e3a85 test-bloom: fix missing 'bloom' from usage string
Like 'get_murmur3' and 'generate_filter', 'get_filter_for_commit' is a
subcommand of `test-tool bloom` not of `test-tool` itself.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-05 22:54:34 -07:00
c7d0e61016 macOS: precompose startup_info->prefix
The "prefix" was precomposed for macOS in commit 5c327502 (MacOS:
precompose_argv_prefix(), 2021-02-03).

However, this commit forgot to update "startup_info->prefix" after
precomposing.

Move the (possible) precomposition towards the end of
setup_git_directory_gently(), so that precompose_string_if_needed()
can use git_config_get_bool("core.precomposeunicode") correctly.

Keep prefix, startup_info->prefix and GIT_PREFIX_ENVIRONMENT all in sync.

And as a result, the prefix no longer needs to be precomposed in git.c

Reported-by: Dmitry Torilov <d.torilov@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-05 17:30:36 -07:00
5020774aef precompose_utf8: make precompose_string_if_needed() public
commit 5c327502 (MacOS: precompose_argv_prefix(), 2021-02-03) uses
the function precompose_string_if_needed() internally.  It is only
used from precompose_argv_prefix() and therefore static in
compat/precompose_utf8.c

Expose this function, it will be used in the next commit.

While there, allow passing a NULL pointer, which will return NULL.

Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-05 17:30:04 -07:00
fc12b6fdde user-manual.txt: assign preface an id and a title
Two among the three warnings raised by "make git.info" are related to the fact
that the preface has not id in user-manual.txt.

    user-manual.texi:15: warning: empty menu entry name in `* : idm4.'
    user-manual.texi:141: warning: @unnumbered missing argument

This causes asciidoc creating an empty preface and an empty title tag in
user-manual.xml which turns to be an empty node in user-manual.texi and
git.info. Consequently, one can notice in user-manual.texi and git.info
a node named "idm4" in the menu and the navigation bar. In emacs, the
first entry of the menu in the git info page is even displayed as empty.

This fix will name "Introduction" the preface and assign it an id.
The result can be seen in the files: user-manual.{xml, texi, html, pdf}
and git.info.

For future reference, the diff between old and new user-manual.xml,
user-manual.texi, git.info, user-manual.html (converted through
html2markdown) and user-manual.pdf (converted through pdftotext) are
attached.

    --- before/user-manual.xml	2021-04-04 03:58:47.758008722 +0200
    +++ after/user-manual.xml	2021-04-04 03:56:40.520551163 +0200
    @@ -7,8 +7,8 @@
     <bookinfo>
         <title>Git User Manual</title>
     </bookinfo>
    -<preface>
    -<title></title>
    +<preface id="_introduction">
    +<title>Introduction</title>
     <simpara>Git is a fast distributed revision control system.</simpara>
     <simpara>This manual is designed to be readable by someone with basic UNIX
     command-line skills, but no previous knowledge of Git.</simpara>

    --- before/user-manual.texi	2021-04-04 03:58:47.490005652 +0200
    +++ after/user-manual.texi	2021-04-04 03:56:40.520551163 +0200
    @@ -7,12 +7,12 @@
     * Git: (git).           A fast distributed revision control system
     @end direntry

    -@node Top, idm4, , (dir)
    +@node Top, Introduction, , (dir)
     @documentlanguage en
     @top Git User Manual

     @menu
    -* : idm4.
    +* Introduction::
     * Repositories and Branches::
     * Exploring Git history::
     * Developing with Git::
    @@ -137,8 +137,8 @@
     @end detailmenu
     @end menu

    -@node idm4, Repositories and Branches, Top, Top
    -@unnumbered
    +@node Introduction, Repositories and Branches, Top, Top
    +@unnumbered Introduction

     Git is a fast distributed revision control system.

    @@ -178,7 +178,7 @@
     Finally, see @ref{Notes and todo list for this manual} for ways that you can help make this manual more
     complete.

    -@node Repositories and Branches, Exploring Git history, idm4, Top
    +@node Repositories and Branches, Exploring Git history, Introduction, Top
     @chapter Repositories and Branches

     @menu

    --- before/git.info	2021-04-04 03:58:46.557994966 +0200
    +++ after/git.info	2021-04-04 03:56:40.520551163 +0200
    @@ -7,14 +7,14 @@
     END-INFO-DIR-ENTRY

    -File: git.info,  Node: Top,  Next: idm4,  Up: (dir)
    +File: git.info,  Node: Top,  Next: Introduction,  Up: (dir)

     Git User Manual
     ***************

     * Menu:

    -* : idm4.
    +* Introduction::
     * Repositories and Branches::
     * Exploring Git history::
     * Developing with Git::
    @@ -137,7 +137,10 @@

    -File: git.info,  Node: idm4,  Next: Repositories and Branches,  Prev: Top,  Up: Top
    +File: git.info,  Node: Introduction,  Next: Repositories and Branches,  Prev: Top,  Up: Top
    +
    +Introduction
    +************

     Git is a fast distributed revision control system.

    @@ -174,7 +177,7 @@
     that you can help make this manual more complete.

    -File: git.info,  Node: Repositories and Branches,  Next: Exploring Git history,  Prev: idm4,  Up: Top
    +File: git.info,  Node: Repositories and Branches,  Next: Exploring Git history,  Prev: Introduction,  Up: Top

     1 Repositories and Branches
     ***************************
    @@ -5471,207 +5474,207 @@
    ...
     Tag Table:
     Node: Top212
    -Node: idm43164
    -Node: Repositories and Branches4465
    ...
    +Node: Introduction3179
    +Node: Repositories and Branches4515
    +Node: How to get a Git repository5128
    ...
    End Tag Table

    --- before/user-manual.html.md	2021-04-04 05:20:55.378695854 +0200
    +++ after/user-manual.html.md	2021-04-04 05:21:11.282850802 +0200
    @@ -4,6 +4,8 @@

      **Table of Contents**

    +Introduction
    +
     1\. Repositories and Branches

    @@ -278,7 +280,7 @@

     Todo list

    -#
    +# Introduction

     Git is a fast distributed revision control system.

    --- before/user-manual.pdf.txt	2021-04-04 05:28:20.367036836 +0200
    +++ after/user-manual.pdf.txt	2021-04-04 05:30:01.680026312 +0200
    @@ -487,6 +487,7 @@

     vii

    +Introduction
     Git is a fast distributed revision control system.
     This manual is designed to be readable by someone with basic UNIX command-line skills, but no previous knowledge of Git.
     Chapter 1 and Chapter 2 explain how to fetch and study a project using git—read these chapters to learn how to build and test a

Signed-off-by: Firmin Martin <firminmartin24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-03 23:19:04 -07:00
2e36527f23 The sixth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-02 14:43:31 -07:00
8a4394d1c1 Merge branch 'zh/format-patch-fractional-reroll-count'
"git format-patch -v<n>" learned to allow a reroll count that is
not an integer.

* zh/format-patch-fractional-reroll-count:
  format-patch: allow a non-integral version numbers
2021-04-02 14:43:14 -07:00
861794b60d Merge branch 'jh/simple-ipc'
A simple IPC interface gets introduced to build services like
fsmonitor on top.

* jh/simple-ipc:
  t0052: add simple-ipc tests and t/helper/test-simple-ipc tool
  simple-ipc: add Unix domain socket implementation
  unix-stream-server: create unix domain socket under lock
  unix-socket: disallow chdir() when creating unix domain sockets
  unix-socket: add backlog size option to unix_stream_listen()
  unix-socket: eliminate static unix_stream_socket() helper function
  simple-ipc: add win32 implementation
  simple-ipc: design documentation for new IPC mechanism
  pkt-line: add options argument to read_packetized_to_strbuf()
  pkt-line: add PACKET_READ_GENTLE_ON_READ_ERROR option
  pkt-line: do not issue flush packets in write_packetized_*()
  pkt-line: eliminate the need for static buffer in packet_write_gently()
2021-04-02 14:43:14 -07:00
c47679d040 Merge branch 'mt/parallel-checkout-part-1'
Preparatory API changes for parallel checkout.

* mt/parallel-checkout-part-1:
  entry: add checkout_entry_ca() taking preloaded conv_attrs
  entry: move conv_attrs lookup up to checkout_entry()
  entry: extract update_ce_after_write() from write_entry()
  entry: make fstat_output() and read_blob_entry() public
  entry: extract a header file for entry.c functions
  convert: add classification for conv_attrs struct
  convert: add get_stream_filter_ca() variant
  convert: add [async_]convert_to_working_tree_ca() variants
  convert: make convert_attrs() and convert structs public
2021-04-02 14:43:14 -07:00
b362acf575 git-send-email: replace "map" in void context with "for"
While using "map" instead of "for" or "map" instead of "grep" and
vice-versa makes for interesting trivia questions when interviewing
Perl programmers, it doesn't make for very readable code. Let's
refactor this loop initially added in 8fd5bb7f44 (git send-email: add
--annotate option, 2008-11-11) to be a for-loop instead.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-02 14:32:29 -07:00
3c80fcb591 Makefile: add QUIET_GEN to "tags" and "TAGS" targets
Don't show the very verbose $(FIND_SOURCE_FILES) command on every
"make TAGS" invocation.

Let's use "generate into temporary and rename to the final file,
after seeing the command that generated the output finished
successfully" pattern, to avoid leaving a file with an incorrect
output generated by a failed command.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-01 22:23:39 -07:00
3007752461 midx.c: improve cache locality in midx_pack_order_cmp()
There is a lot of pointer dereferencing in the pre-image version of
'midx_pack_order_cmp()', which this patch gets rid of.

Instead of comparing the pack preferred-ness and then the pack id, both
of these checks are done at the same time by using the high-order bit of
the pack id to represent whether it's preferred. Then the pack id and
offset are compared as usual.

This produces the same result so long as there are less than 2^31 packs,
which seems like a likely assumption to make in practice.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-01 13:07:37 -07:00
38ff7cabb6 pack-revindex: write multi-pack reverse indexes
Implement the writing half of multi-pack reverse indexes. This is
nothing more than the format describe a few patches ago, with a new set
of helper functions that will be used to clear out stale .rev files
corresponding to old MIDXs.

Unfortunately, a very similar comparison function as the one implemented
recently in pack-revindex.c is reimplemented here, this time accepting a
MIDX-internal type. An effort to DRY these up would create more
indirection and overhead than is necessary, so it isn't pursued here.

Currently, there are no callers which pass the MIDX_WRITE_REV_INDEX
flag, meaning that this is all dead code. But, that won't be the case
for long, since subsequent patches will introduce the multi-pack bitmap,
which will begin passing this field.

(In midx.c:write_midx_internal(), the two adjacent if statements share a
conditional, but are written separately since the first one will
eventually also handle the MIDX_WRITE_BITMAP flag, which does not yet
exist.)

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-01 13:07:37 -07:00
a587b5a786 pack-write.c: extract 'write_rev_file_order'
Existing callers provide the reverse index code with an array of 'struct
pack_idx_entry *'s, which is then sorted by pack order (comparing the
offsets of each object within the pack).

Prepare for the multi-pack index to write a .rev file by providing a way
to write the reverse index without an array of pack_idx_entry (which the
MIDX code does not have).

Instead, callers can invoke 'write_rev_index_positions()', which takes
an array of uint32_t's. The ith entry in this array specifies the ith
object's (in index order) position within the pack (in pack order).

Expose this new function for use in a later patch, and rewrite the
existing write_rev_file() in terms of this new function.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-01 13:07:37 -07:00
f894081dea pack-revindex: read multi-pack reverse indexes
Implement reading for multi-pack reverse indexes, as described in the
previous patch.

Note that these functions don't yet have any callers, and won't until
multi-pack reachability bitmaps are introduced in a later patch series.
In the meantime, this patch implements some of the infrastructure
necessary to support multi-pack bitmaps.

There are three new functions exposed by the revindex API:

  - load_midx_revindex(): loads the reverse index corresponding to the
    given multi-pack index.

  - midx_to_pack_pos() and pack_pos_to_midx(): these convert between the
    multi-pack index and pseudo-pack order.

load_midx_revindex() and pack_pos_to_midx() are both relatively
straightforward.

load_midx_revindex() needs a few functions to be exposed from the midx
API. One to get the checksum of a midx, and another to get the .rev's
filename. Similar to recent changes in the packed_git struct, three new
fields are added to the multi_pack_index struct: one to keep track of
the size, one to keep track of the mmap'd pointer, and another to point
past the header and at the reverse index's data.

pack_pos_to_midx() simply reads the corresponding entry out of the
table.

midx_to_pack_pos() is the trickiest, since it needs to find an object's
position in the psuedo-pack order, but that order can only be recovered
in the .rev file itself. This mapping can be implemented with a binary
search, but note that the thing we're binary searching over isn't an
array of values, but rather a permuted order of those values.

So, when comparing two items, it's helpful to keep in mind the
difference. Instead of a traditional binary search, where you are
comparing two things directly, here we're comparing a (pack, offset)
tuple with an index into the multi-pack index. That index describes
another (pack, offset) tuple, and it is _those_ two tuples that are
compared.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-01 13:07:37 -07:00
b25fd24c00 Documentation/technical: describe multi-pack reverse indexes
As a prerequisite to implementing multi-pack bitmaps, motivate and
describe the format and ordering of the multi-pack reverse index.

The subsequent patch will implement reading this format, and the patch
after that will implement writing it while producing a multi-pack index.

Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-01 13:07:37 -07:00
62f2c1b509 midx: make some functions non-static
In a subsequent commit, pack-revindex.c will become responsible for
sorting a list of objects in the "MIDX pack order" (which will be
defined in the following patch). To do so, it will need to be know the
pack identifier and offset within that pack for each object in the MIDX.

The MIDX code already has functions for doing just that
(nth_midxed_offset() and nth_midxed_pack_int_id()), but they are
statically declared.

Since there is no reason that they couldn't be exposed publicly, and
because they are already doing exactly what the caller in
pack-revindex.c will want, expose them publicly so that they can be
reused there.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-01 13:07:37 -07:00
9f19161172 midx: keep track of the checksum
write_midx_internal() uses a hashfile to write the multi-pack index, but
discards its checksum. This makes sense, since nothing that takes place
after writing the MIDX cares about its checksum.

That is about to change in a subsequent patch, when the optional
reverse index corresponding to the MIDX will want to include the MIDX's
checksum.

Store the checksum of the MIDX in preparation for that.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-01 13:07:37 -07:00
7240cc4b65 midx: don't free midx_name early
A subsequent patch will need to refer back to 'midx_name' later on in
the function. In fact, this variable is already free()'d later on, so
this makes the later free() no longer redundant.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-01 13:07:37 -07:00
9218c6a40c midx: allow marking a pack as preferred
When multiple packs in the multi-pack index contain the same object, the
MIDX machinery must make a choice about which pack it associates with
that object. Prior to this patch, the lowest-ordered[1] pack was always
selected.

Pack selection for duplicate objects is relatively unimportant today,
but it will become important for multi-pack bitmaps. This is because we
can only invoke the pack-reuse mechanism when all of the bits for reused
objects come from the reuse pack (in order to ensure that all reused
deltas can find their base objects in the same pack).

To encourage the pack selection process to prefer one pack over another
(the pack to be preferred is the one a caller would like to later use as
a reuse pack), introduce the concept of a "preferred pack". When
provided, the MIDX code will always prefer an object found in a
preferred pack over any other.

No format changes are required to store the preferred pack, since it
will be able to be inferred with a corresponding MIDX bitmap, by looking
up the pack associated with the object in the first bit position (this
ordering is described in detail in a subsequent commit).

[1]: the ordering is specified by MIDX internals; for our purposes we
can consider the "lowest ordered" pack to be "the one with the
most-recent mtime.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-01 13:07:37 -07:00
4fe788b1b0 builtin/clone.c: add --reject-shallow option
In some scenarios, users may want more history than the repository
offered for cloning, which happens to be a shallow repository, can
give them. But because users don't know it is a shallow repository
until they download it to local, we may want to refuse to clone
this kind of repository, without creating any unnecessary files.

The '--depth=x' option cannot be used as a solution; the source may
be deep enough to give us 'x' commits when cloned, but the user may
later need to deepen the history to arbitrary depth.

Teach '--reject-shallow' option to "git clone" to abort as soon as
we find out that we are cloning from a shallow repository.

Signed-off-by: Li Linchao <lilinchao@oschina.cn>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-01 12:58:58 -07:00
c685450880 ref-filter: fix NULL check for parse object failure
After we run parse_object_buffer() to get an object's contents, we try
to check that the return value wasn't NULL. However, since our "struct
object" is a pointer-to-pointer, and we assign like:

  *obj = parse_object_buffer(...);

it's not correct to check:

  if (!obj)

That will always be true, since our double pointer will continue to
point to the single pointer (which is itself NULL). This is a regression
that was introduced by aa46a0da30 (ref-filter: use oid_object_info() to
get object, 2018-07-17); since that commit we'll segfault on a parse
failure, as we try to look at the NULL object pointer.

There are many ways a parse could fail, but most of them are hard to set
up in the tests (it's easy to make a bogus object, but update-ref will
refuse to point to it). The test here uses a tag which points to a wrong
object type. A parse of just the broken tag object will succeed, but
seeing both tag objects in the same process will lead to a parse error
(since we'll see the pointed-to object as both types).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-01 12:54:21 -07:00
3f267a1128 builtin/pack-objects.c: respect 'pack.preferBitmapTips'
When writing a new pack with a bitmap, it is sometimes convenient to
indicate some reference prefixes which should receive priority when
selecting which commits to receive bitmaps.

A truly motivated caller could accomplish this by setting
'pack.islandCore', (since all commits in the core island are similarly
marked as preferred) but this requires callers to opt into using delta
islands, which they may or may not want to do.

Introduce a new multi-valued configuration, 'pack.preferBitmapTips' to
allow callers to specify a list of reference prefixes. All references
which have a prefix contained in 'pack.preferBitmapTips' will mark their
tips as "preferred" in the same way as commits are marked as preferred
for selection by 'pack.islandCore'.

The choice of the verb "prefer" is intentional: marking the NEEDS_BITMAP
flag on an object does *not* guarantee that that object will receive a
bitmap. It merely guarantees that that commit will receive a bitmap over
any *other* commit in the same window by bitmap_writer_select_commits().

The test this patch adds reflects this quirk, too. It only tests that
a commit (which didn't receive bitmaps by default) is selected for
bitmaps after changing the value of 'pack.preferBitmapTips' to include
it. Other commits may lose their bitmaps as a byproduct of how the
selection process works (bitmap_writer_select_commits() ignores the
remainder of a window after seeing a commit with the NEEDS_BITMAP flag).

This configuration will aide in selecting important references for
multi-pack bitmaps, since they do not respect the same pack.islandCore
configuration. (They could, but doing so may be confusing, since it is
packs--not bitmaps--which are influenced by the delta-islands
configuration).

In a fork network repository (one which lists all forks of a given
repository as remotes), for example, it is useful to set
pack.preferBitmapTips to 'refs/remotes/<root>/heads' and
'refs/remotes/<root>/tags', where '<root>' is an opaque identifier
referring to the repository which is at the base of the fork chain.

Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-31 23:14:03 -07:00
483fa7f42d t/helper/test-bitmap.c: initial commit
Add a new 'bitmap' test-tool which can be used to list the commits that
have received bitmaps.

In theory, a determined tester could run 'git rev-list --test-bitmap
<commit>' to check if '<commit>' received a bitmap or not, since
'--test-bitmap' exits with a non-zero code when it can't find the
requested commit.

But this is a dubious behavior to rely on, since arguably 'git
rev-list' could continue its object walk outside of which commits are
covered by bitmaps.

This will be used to test the behavior of 'pack.preferBitmapTips', which
will be added in the following patch.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-31 23:14:03 -07:00
dff5e49e51 pack-bitmap: add 'test_bitmap_commits()' helper
The next patch will add a 'bitmap' test-tool which prints the list of
commits that have bitmaps computed.

The test helper could implement this itself, but it would need access to
the 'bitmaps' field of the 'pack_bitmap' struct. To avoid exposing this
private detail, implement the entirety of the helper behind a
test_bitmap_commits() function in pack-bitmap.c.

There is some precedence for this with test_bitmap_walk() which is used
to implement the '--test-bitmap' flag in 'git rev-list' (and is also
implemented in pack-bitmap.c).

A caller will be added in the next patch.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-31 23:14:03 -07:00
39edfd5cbc sequencer: fix edit handling for cherry-pick and revert messages
save_opts() should save any non-default values.  It was intended to do
this, but since most options in struct replay_opts default to 0, it only
saved non-zero values.  Unfortunately, this does not always work for
options.edit.  Roughly speaking, options.edit had a default value of 0
for cherry-pick but a default value of 1 for revert.  Make save_opts()
record a value whenever it differs from the default.

options.edit was also overly simplistic; we had more than two cases.
The behavior that previously existed was as follows:

                       Non-conflict commits    Right after Conflict
    revert             Edit iff isatty(0)      Edit (ignore isatty(0))
    cherry-pick        No edit                 See above
    Specify --edit     Edit (ignore isatty(0)) See above
    Specify --no-edit  (*)                     See above

    (*) Before stopping for conflicts, No edit is the behavior.  After
        stopping for conflicts, the --no-edit flag is not saved so see
        the first two rows.

However, the expected behavior is:

                       Non-conflict commits    Right after Conflict
    revert             Edit iff isatty(0)      Edit iff isatty(0)
    cherry-pick        No edit                 Edit iff isatty(0)
    Specify --edit     Edit (ignore isatty(0)) Edit (ignore isatty(0))
    Specify --no-edit  No edit                 No edit

In order to get the expected behavior, we need to change options.edit
to a tri-state: unspecified, false, or true.  When specified, we follow
what it says.  When unspecified, we need to check whether the current
commit being created is resolving a conflict as well as consulting
options.action and isatty(0).  While at it, add a should_edit() utility
function that compresses options.edit down to a boolean based on the
additional information for the non-conflict case.

continue_single_pick() is the function responsible for resuming after
conflict cases, regardless of whether there is one commit being picked
or many.  Make this function stop assuming edit behavior in all cases,
so that it can correctly handle !isatty(0) and specific requests to not
edit the commit message.

Reported-by: Renato Botelho <garga@freebsd.org>
Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-31 14:10:50 -07:00
a65ce7f831 The fifth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 14:35:38 -07:00
5c2f7ff018 Merge branch 'jc/doc-format-patch-clarify'
Explain pieces of the format-patch output upfront before the rest
of the documentation starts referring to them.

* jc/doc-format-patch-clarify:
  format-patch: give an overview of what a "patch" message is
2021-03-30 14:35:38 -07:00
7652ce966f Merge branch 'ab/detox-gettext-tests'
Testfix.

* ab/detox-gettext-tests:
  mktag tests: fix broken "&&" chain
2021-03-30 14:35:38 -07:00
4730c5e273 Merge branch 'hx/pack-objects-chunk-comment'
Comment update.

* hx/pack-objects-chunk-comment:
  pack-objects: fix comment of reused_chunk.difference
2021-03-30 14:35:37 -07:00
1ba947cf15 Merge branch 'rf/send-email-hookspath'
"git send-email" learned to honor the core.hooksPath configuration.

* rf/send-email-hookspath:
  git-send-email: Respect core.hooksPath setting
2021-03-30 14:35:37 -07:00
dc2a073036 Merge branch 'ab/remove-rebase-usebuiltin'
Remove the final hint that we used to have a scripted "git rebase".

* ab/remove-rebase-usebuiltin:
  rebase: remove transitory rebase.useBuiltin setting & env
2021-03-30 14:35:37 -07:00
5013802862 Merge branch 'cs/http-use-basic-after-failed-negotiate'
When accessing a server with a URL like https://user:pass@site/, we
did not to fall back to the basic authentication with the
credential material embedded in the URL after the "Negotiate"
authentication failed.  Now we do.

* cs/http-use-basic-after-failed-negotiate:
  remote-curl: fall back to basic auth if Negotiate fails
2021-03-30 14:35:37 -07:00
b2309ad822 Merge branch 'ab/diff-no-index-tests'
More test coverage over "diff --no-index".

* ab/diff-no-index-tests:
  diff --no-index tests: test mode normalization
  diff --no-index tests: add test for --exit-code
2021-03-30 14:35:37 -07:00
ad16f748f2 Merge branch 'ab/read-tree'
Code simplification by removing support for a caller that is long gone.

* ab/read-tree:
  tree.h API: simplify read_tree_recursive() signature
  tree.h API: expose read_tree_1() as read_tree_at()
  archive: stop passing "stage" through read_tree_recursive()
  ls-files: refactor away read_tree()
  ls-files: don't needlessly pass around stage variable
  tree.c API: move read_tree() into builtin/ls-files.c
  ls-files tests: add meaningful --with-tree tests
  show tests: add test for "git show <tree>"
2021-03-30 14:35:37 -07:00
aab55b1d6e Merge branch 'bs/asciidoctor-installation-hints'
Doc update.

* bs/asciidoctor-installation-hints:
  INSTALL: note on using Asciidoctor to build doc
2021-03-30 14:35:36 -07:00
9210c68d2a Merge branch 'mt/checkout-remove-nofollow'
When "git checkout" removes a path that does not exist in the
commit it is checking out, it wasn't careful enough not to follow
symbolic links, which has been corrected.

* mt/checkout-remove-nofollow:
  checkout: don't follow symlinks when removing entries
  symlinks: update comment on threaded_check_leading_path()
2021-03-30 14:35:36 -07:00
c9e40ae8ec p2000: add sparse-index repos
p2000-sparse-operations.sh compares different Git commands in
repositories with many files at HEAD but using sparse-checkout to focus
on a small portion of those files.

Add extra copies of the repository that use the sparse-index format so
we can track how that affects the performance of different commands.

At this point in time, the sparse-index is 100% overhead from the CPU
front, and this is measurable in these tests:

Test
---------------------------------------------------------------
2000.2: git status (full-index-v3)              0.59(0.51+0.12)
2000.3: git status (full-index-v4)              0.59(0.52+0.11)
2000.4: git status (sparse-index-v3)            1.40(1.32+0.12)
2000.5: git status (sparse-index-v4)            1.41(1.36+0.08)
2000.6: git add -A (full-index-v3)              2.32(1.97+0.19)
2000.7: git add -A (full-index-v4)              2.17(1.92+0.14)
2000.8: git add -A (sparse-index-v3)            2.31(2.21+0.15)
2000.9: git add -A (sparse-index-v4)            2.30(2.20+0.13)
2000.10: git add . (full-index-v3)              2.39(2.02+0.20)
2000.11: git add . (full-index-v4)              2.20(1.94+0.16)
2000.12: git add . (sparse-index-v3)            2.36(2.27+0.12)
2000.13: git add . (sparse-index-v4)            2.33(2.21+0.16)
2000.14: git commit -a -m A (full-index-v3)     2.47(2.12+0.20)
2000.15: git commit -a -m A (full-index-v4)     2.26(2.00+0.17)
2000.16: git commit -a -m A (sparse-index-v3)   3.01(2.92+0.16)
2000.17: git commit -a -m A (sparse-index-v4)   3.01(2.94+0.15)

Note that there is very little difference between the v3 and v4 index
formats when the sparse-index is enabled. This is primarily due to the
fact that the relative file sizes are the same, and the command time is
mostly taken up by parsing tree objects to expand the sparse index into
a full one.

With the current file layout, the index file sizes are given by this
table:

       |  full index | sparse index |
       +-------------+--------------+
    v3 |     108 MiB |      1.6 MiB |
    v4 |      80 MiB |      1.2 MiB |

Future updates will improve the performance of Git commands when the
index is sparse.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:57:49 -07:00
9ad2d5ea71 sparse-index: loose integration with cache_tree_verify()
The cache_tree_verify() method is run when GIT_TEST_CHECK_CACHE_TREE
is enabled, which it is by default in the test suite. The logic must
be adjusted for the presence of these directory entries.

For now, leave the test as a simple check for whether the directory
entry is sparse. Do not go any further until needed.

This allows us to re-enable GIT_TEST_CHECK_CACHE_TREE in
t1092-sparse-checkout-compatibility.sh. Further,
p2000-sparse-operations.sh uses the test suite and hence this is enabled
for all tests. We need to integrate with it before we run our
performance tests with a sparse-index.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:57:48 -07:00
2de37c536d cache-tree: integrate with sparse directory entries
The cache-tree extension was previously disabled with sparse indexes.
However, the cache-tree is an important performance feature for commands
like 'git status' and 'git add'. Integrate it with sparse directory
entries.

When writing a sparse index, completely clear and recalculate the cache
tree. By starting from scratch, the only integration necessary is to
check if we hit a sparse directory entry and create a leaf of the
cache-tree that has an entry_count of one and no subtrees.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:57:48 -07:00
dcc5fd5fd2 sparse-checkout: disable sparse-index
We use 'git sparse-checkout init --cone --sparse-index' to toggle the
sparse-index feature. It makes sense to also disable it when running
'git sparse-checkout disable'. This is particularly important because it
removes the extensions.sparseIndex config option, allowing other tools
to use this Git repository again.

This does mean that 'git sparse-checkout init' will not re-enable the
sparse-index feature, even if it was previously enabled.

While testing this feature, I noticed that the sparse-index was not
being written on the first run, but by a second. This was caught by the
call to 'test-tool read-cache --table'. This requires adjusting some
assignments to core_apply_sparse_checkout and pl.use_cone_patterns in
the sparse_checkout_init() logic.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:57:48 -07:00
122ba1f7b5 sparse-checkout: toggle sparse index from builtin
The sparse index extension is used to signal that index writes should be
in sparse mode. This was only updated using GIT_TEST_SPARSE_INDEX=1.

Add a '--[no-]sparse-index' option to 'git sparse-checkout init' that
specifies if the sparse index should be used. It also updates the index
to use the correct format, either way. Add a warning in the
documentation that the use of a repository extension might reduce
compatibility with third-party tools. 'git sparse-checkout init' already
sets extension.worktreeConfig, which places most sparse-checkout users
outside of the scope of most third-party tools.

Update t1092-sparse-checkout-compatibility.sh to use this CLI instead of
GIT_TEST_SPARSE_INDEX=1.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:57:48 -07:00
58300f4743 sparse-index: add index.sparse config option
When enabled, this config option signals that index writes should
attempt to use sparse-directory entries.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:57:47 -07:00
0938e6ff55 sparse-index: check index conversion happens
Add a test case that uses test_region to ensure that we are truly
expanding a sparse index to a full one, then converting back to sparse
when writing the index. As we integrate more Git commands with the
sparse index, we will convert these commands to check that we do _not_
convert the sparse index to a full index and instead stay sparse the
entire time.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:57:47 -07:00
13e1331247 unpack-trees: allow sparse directories
The index_pos_by_traverse_info() currently throws a BUG() when a
directory entry exists exactly in the index. We need to consider that it
is possible to have a directory in a sparse index as long as that entry
is itself marked with the skip-worktree bit.

The 'pos' variable is assigned a negative value if an exact match is not
found. Since a directory name can be an exact match, it is no longer an
error to have a nonnegative 'pos' value.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:57:47 -07:00
f442313e2e submodule: sparse-index should not collapse links
A submodule is stored as a "Git link" that actually points to a commit
within a submodule. Submodules are populated or not depending on
submodule configuration, not sparse-checkout. To ensure that the
sparse-index feature integrates correctly with submodules, we should not
collapse a directory if there is a Git link within its range.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:57:47 -07:00
6e773527b6 sparse-index: convert from full to sparse
If we have a full index, then we can convert it to a sparse index by
replacing directories outside of the sparse cone with sparse directory
entries. The convert_to_sparse() method does this, when the situation is
appropriate.

For now, we avoid converting the index to a sparse index if:

 1. the index is split.
 2. the index is already sparse.
 3. sparse-checkout is disabled.
 4. sparse-checkout does not use cone mode.

Finally, we currently limit the conversion to when the
GIT_TEST_SPARSE_INDEX environment variable is enabled. A mode using Git
config will be added in a later change.

The trickiest thing about this conversion is that we might not be able
to mark a directory as a sparse directory just because it is outside the
sparse cone. There might be unmerged files within that directory, so we
need to look for those. Also, if there is some strange reason why a file
is not marked with CE_SKIP_WORKTREE, then we should give up on
converting that directory. There is still hope that some of its
subdirectories might be able to convert to sparse, so we keep looking
deeper.

The conversion process is assisted by the cache-tree extension. This is
calculated from the full index if it does not already exist. We then
abandon the cache-tree as it no longer applies to the newly-sparse
index. Thus, this cache-tree will be recalculated in every
sparse-full-sparse round-trip until we integrate the cache-tree
extension with the sparse index.

Some Git commands use the index after writing it. For example, 'git add'
will update the index, then write it to disk, then read its entries to
report information. To keep the in-memory index in a full state after
writing, we re-expand it to a full one after the write. This is wasteful
for commands that only write the index and do not read from it again,
but that is only the case until we make those commands "sparse aware."

We can compare the behavior of the sparse-index in
t1092-sparse-checkout-compability.sh by using GIT_TEST_SPARSE_INDEX=1
when operating on the 'sparse-index' repo. We can also compare the two
sparse repos directly, such as comparing their indexes (when expanded to
full in the case of the 'sparse-index' repo). We also verify that the
index is actually populated with sparse directory entries.

The 'checkout and reset (mixed)' test is marked for failure when
comparing a sparse repo to a full repo, but we can compare the two
sparse-checkout cases directly to ensure that we are not changing the
behavior when using a sparse index.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:57:47 -07:00
cd42415fb4 sparse-index: add 'sdir' index extension
The index format does not currently allow for sparse directory entries.
This violates some expectations that older versions of Git or
third-party tools might not understand. We need an indicator inside the
index file to warn these tools to not interact with a sparse index
unless they are aware of sparse directory entries.

Add a new _required_ index extension, 'sdir', that indicates that the
index may contain sparse directory entries. This allows us to continue
to use the differences in index formats 2, 3, and 4 before we create a
new index version 5 in a later change.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:57:46 -07:00
836e25c51b sparse-checkout: hold pattern list in index
As we modify the sparse-checkout definition, we perform index operations
on a pattern_list that only exists in-memory. This allows easy backing
out in case the index update fails.

However, if the index write itself cares about the sparse-checkout
pattern set, we need access to that in-memory copy. Place a pointer to
a 'struct pattern_list' in the index so we can access this on-demand.
This will be used in the next change which uses the sparse-checkout
definition to filter out directories that are outside the sparse cone.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:57:46 -07:00
6863df3550 unpack-trees: ensure full index
The next change will translate full indexes into sparse indexes at write
time. The existing logic provides a way for every sparse index to be
expanded to a full index at read time. However, there are cases where an
index is written and then continues to be used in-memory to perform
further updates.

unpack_trees() is frequently called after such a write. In particular,
commands like 'git reset' do this double-update of the index.

Ensure that we have a full index when entering unpack_trees(), but only
when command_requires_full_index is true. This is always true at the
moment, but we will later relax that after unpack_trees() is updated to
handle sparse directory entries.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:57:46 -07:00
2782db3eed test-tool: don't force full index
We will use 'test-tool read-cache --table' to check that a sparse
index is written as part of init_repos. Since we will no longer always
expand a sparse index into a full index, add an '--expand' parameter
that adds a call to ensure_full_index() so we can compare a sparse index
directly against a full index, or at least what the in-memory index
looks like when expanded in this way.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:57:46 -07:00
e2df6c3972 test-read-cache: print cache entries with --table
This table is helpful for discovering data in the index to ensure it is
being written correctly, especially as we build and test the
sparse-index. This table includes an output format similar to 'git
ls-tree', but should not be compared to that directly. The biggest
reasons are that 'git ls-tree' includes a tree entry for every
subdirectory, even those that would not appear as a sparse directory in
a sparse-index. Further, 'git ls-tree' does not use a trailing directory
separator for its tree rows.

This does not print the stat() information for the blobs. That will be
added in a future change with another option. The tests that are added
in the next few changes care only about the object types and IDs.
However, this future need for full index information justifies the need
for this test helper over extending a user-facing feature, such as 'git
ls-files'.

To make the option parsing slightly more robust, wrap the string
comparisons in a loop adapted from test-dir-iterator.c.

Care must be taken with the final check for the 'cnt' variable. We
continue the expectation that the numerical value is the final argument.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:57:46 -07:00
ecfc47c066 t1092: compare sparse-checkout to sparse-index
Add a new 'sparse-index' repo alongside the 'full-checkout' and
'sparse-checkout' repos in t1092-sparse-checkout-compatibility.sh. Also
add run_on_sparse and test_sparse_match helpers. These helpers will be
used when the sparse index is implemented.

Add the GIT_TEST_SPARSE_INDEX environment variable to enable the
sparse-index by default. This can be enabled across all tests, but that
will only affect cases where the sparse-checkout feature is enabled.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:57:45 -07:00
4300f8442a sparse-index: implement ensure_full_index()
We will mark an in-memory index_state as having sparse directory entries
with the sparse_index bit. These currently cannot exist, but we will add
a mechanism for collapsing a full index to a sparse one in a later
change. That will happen at write time, so we must first allow parsing
the format before writing it.

Commands or methods that require a full index in order to operate can
call ensure_full_index() to expand that index in-memory. This requires
parsing trees using that index's repository.

Sparse directory entries have a specific 'ce_mode' value. The macro
S_ISSPARSEDIR(ce->ce_mode) can check if a cache_entry 'ce' has this type.
This ce_mode is not possible with the existing index formats, so we don't
also verify all properties of a sparse-directory entry, which are:

 1. ce->ce_mode == 0040000
 2. ce->flags & CE_SKIP_WORKTREE is true
 3. ce->name[ce->namelen - 1] == '/' (ends in dir separator)
 4. ce->oid references a tree object.

These are all semi-enforced in ensure_full_index() to some extent. Any
deviation will cause a warning at minimum or a failure in the worst
case.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:57:45 -07:00
3964fc2aae sparse-index: add guard to ensure full index
Upcoming changes will introduce modifications to the index format that
allow sparse directories. It will be useful to have a mechanism for
converting those sparse index files into full indexes by walking the
tree at those sparse directories. Name this method ensure_full_index()
as it will guarantee that the index is fully expanded.

This method is not implemented yet, and instead we focus on the
scaffolding to declare it and call it at the appropriate time.

Add a 'command_requires_full_index' member to struct repo_settings. This
will be an indicator that we need the index in full mode to do certain
index operations. This starts as being true for every command, then we
will set it to false as some commands integrate with sparse indexes.

If 'command_requires_full_index' is true, then we will immediately
expand a sparse index to a full one upon reading from disk. This
suffices for now, but we will want to add more callers to
ensure_full_index() later.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:57:45 -07:00
4b3f765a2f t1092: clean up script quoting
This test was introduced in 19a0acc83e (t1092: test interesting
sparse-checkout scenarios, 2021-01-23), but it contains issues with quoting
that were not noticed until starting this follow-up series. The old
mechanism would drop quoting such as in

   test_all_match git commit -m "touch README.md"

The above happened to work because README.md is a file in the
repository, so 'git commit -m touch REAMDE.md' would succeed by
accident.

Other cases included quoting for no good reason, so clean that up now.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:57:45 -07:00
0b5fcb08b5 t/perf: add performance test for sparse operations
Create a test script that takes the default performance test (the Git
codebase) and multiplies it by 256 using four layers of duplicated
trees of width four. This results in nearly one million blob entries in
the index. Then, we can clone this repository with sparse-checkout
patterns that demonstrate four copies of the initial repository. Each
clone will use a different index format or mode so peformance can be
tested across the different options.

Note that the initial repo is stripped of submodules before doing the
copies. This preserves the expected data shape of the sparse index,
because directories containing submodules are not collapsed to a sparse
directory entry.

Run a few Git commands on these clones, especially those that use the
index (status, add, commit).

Here are the results on my Linux machine:

Test
--------------------------------------------------------------
2000.2: git status (full-index-v3)             0.37(0.30+0.09)
2000.3: git status (full-index-v4)             0.39(0.32+0.10)
2000.4: git add -A (full-index-v3)             1.42(1.06+0.20)
2000.5: git add -A (full-index-v4)             1.26(0.98+0.16)
2000.6: git add . (full-index-v3)              1.40(1.04+0.18)
2000.7: git add . (full-index-v4)              1.26(0.98+0.17)
2000.8: git commit -a -m A (full-index-v3)     1.42(1.11+0.16)
2000.9: git commit -a -m A (full-index-v4)     1.33(1.08+0.16)

It is perhaps noteworthy that there is an improvement when using index
version 4. This is because the v3 index uses 108 MiB while the v4
index uses 80 MiB. Since the repeated portions of the directories are
very short (f3/f1/f2, for example) this ratio is less pronounced than in
similarly-sized real repositories.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:57:44 -07:00
0ad6090bdd sparse-index: design doc and format update
This begins a long effort to update the index format to allow sparse
directory entries. This should result in a significant improvement to
Git commands when HEAD contains millions of files, but the user has
selected many fewer files to keep in their sparse-checkout definition.

Currently, the index format is only updated in the presence of
extensions.sparseIndex instead of increasing a file format version
number. This is temporary, and index v5 is part of the plan for future
work in this area.

The design document details many of the reasons for embarking on this
work, and also the plan for completing it safely.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:57:44 -07:00
86d174b724 t/helper/test-read-midx.c: add '--show-objects'
The 'read-midx' helper is used in places like t5319 to display basic
information about a multi-pack-index.

In the next patch, the MIDX writing machinery will learn a new way to
choose from which pack an object is selected when multiple copies of
that object exist.

To disambiguate which pack introduces an object so that this feature can
be tested, add a '--show-objects' option which displays additional
information about each object in the MIDX.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:16:56 -07:00
cd57bc41bb builtin/multi-pack-index.c: display usage on unrecognized command
When given a sub-command that it doesn't understand, 'git
multi-pack-index' dies with the following message:

    $ git multi-pack-index bogus
    fatal: unrecognized subcommand: bogus

Instead of 'die()'-ing, we can display the usage text, which is much
more helpful:

    $ git.compile multi-pack-index bogus
    error: unrecognized subcommand: bogus
    usage: git multi-pack-index [<options>] write
       or: git multi-pack-index [<options>] verify
       or: git multi-pack-index [<options>] expire
       or: git multi-pack-index [<options>] repack [--batch-size=<size>]

        --object-dir <file>   object directory containing set of packfile and pack-index pairs
        --progress            force progress reporting

While we're at it, clean up some duplication between the "no sub-command"
and "unrecognized sub-command" conditionals.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:16:56 -07:00
690eb05719 builtin/multi-pack-index.c: don't enter bogus cmd_mode
Even before the recent refactoring, 'git multi-pack-index' calls
'trace2_cmd_mode()' before verifying that the sub-command is recognized.

Push this call down into the individual sub-commands so that we don't
enter a bogus command mode.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:16:56 -07:00
60ca94769c builtin/multi-pack-index.c: split sub-commands
Handle sub-commands of the 'git multi-pack-index' builtin (e.g.,
"write", "repack", etc.) separately from one another. This allows
sub-commands with unique options, without forcing cmd_multi_pack_index()
to reject invalid combinations itself.

This comes at the cost of some duplication and boilerplate. Luckily, the
duplication is reduced to a minimum, since common options are shared
among sub-commands due to a suggestion by Ævar. (Sub-commands do have to
retain the common options, too, since this builtin accepts common
options on either side of the sub-command).

Roughly speaking, cmd_multi_pack_index() parses options (including
common ones), and stops at the first non-option, which is the
sub-command. It then dispatches to the appropriate sub-command, which
parses the remaining options (also including common options).

Unknown options are kept by the sub-commands in order to detect their
presence (and complain that too many arguments were given).

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:16:56 -07:00
b25b727494 builtin/multi-pack-index.c: define common usage with a macro
Factor out the usage message into pieces corresponding to each mode.
This avoids options specific to one sub-command from being shared with
another in the usage.

A subsequent commit will use these #define macros to have usage
variables for each sub-command without duplicating their contents.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:16:56 -07:00
cf1f5389ec builtin/multi-pack-index.c: don't handle 'progress' separately
Now that there is a shared 'flags' member in the options structure,
there is no need to keep track of whether to force progress or not,
since ultimately the decision of whether or not to show a progress meter
is controlled by a bit in the flags member.

Manipulate that bit directly, and drop the now-unnecessary 'progress'
field while we're at it.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:16:56 -07:00
f7c4d63e35 builtin/multi-pack-index.c: inline 'flags' with options
Subcommands of the 'git multi-pack-index' command (e.g., 'write',
'verify', etc.) will want to optionally change a set of shared flags
that are eventually passed to the MIDX libraries.

Right now, options and flags are handled separately. That's fine, since
the options structure is never passed around. But a future patch will
make it so that common options shared by all sub-commands are defined in
a common location. That means that "flags" would have to become a global
variable.

Group it with the options structure so that we reduce the number of
global variables we have overall.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:16:56 -07:00
5ee90326dc column, range-diff: downcase option description
It is customary not to begin the help text for each option given to
the parse-options API with a capital letter. Various (sub)commands'
option arrays don't follow the guideline provided by the parse_options
Documentation regarding the descriptions.

Downcase the first word of some option descriptions for "column"
and "range-diff".

Signed-off-by: Chinmoy Chakraborty <chinmoy12c@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-29 14:06:08 -07:00
958a5f5dfe cmake(install): include vcpkg dlls
Our CMake configuration generates not only build definitions, but also
install definitions: After building Git using `msbuild git.sln`, the
built artifacts can be installed via `msbuild INSTALL.vcxproj`.

To specify _where_ the files should be installed, the
`-DCMAKE_INSTALL_PREFIX=<path>` option can be used when running CMake.

However, this process would really only install the files that were just
built. On Windows, we need more than that: We also need the `.dll` files
of the dependencies (such as libcurl). The `vcpkg` ecosystem, which we
use to obtain those dependencies, can be asked to install said `.dll`
files really easily, so let's do that.

This requires more than just the built `vcpkg` artifacts in the CI build
definition; We now clone the `vcpkg` repository so that the relevant
CMake scripts are available, in particular the ones related to defining
the toolchain.

Signed-off-by: Dennis Ameling <dennis@dennisameling.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-29 13:49:04 -07:00
e8772a7af5 cmake: add a preparatory work-around to accommodate vcpkg
We are about to add support for installing the `.dll` files of Git's
dependencies (such as libcurl) in the CMake configuration. The `vcpkg`
ecosystem from which we get said dependencies makes that relatively
easy: simply turn on `X_VCPKG_APPLOCAL_DEPS_INSTALL`.

However, current `vcpkg` introduces a limitation if one does that:
While it is totally cool with CMake to specify multiple targets within
one invocation of `install(TARGETS ...) (at least according to
https://cmake.org/cmake/help/latest/command/install.html#command:install),
`vcpkg`'s parser insists on a single target per `install(TARGETS ...)`
invocation.

Well, that's easily accomplished: Let's feed the targets individually to
the `install(TARGETS ...)` function in a `foreach()` look.

This also has the advantage that we do not have to manually cull off the
two entries from the `${PROGRAMS_BUILT}` array before scheduling the
remainder to be installed into `libexec/git-core`. Instead, we iterate
through the array and decide for each entry where it wants to go.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-29 13:49:04 -07:00
3745e2693d fetch-pack: use new fsck API to printing dangling submodules
Refactor the check added in 5476e1efde (fetch-pack: print and use
dangling .gitmodules, 2021-02-22) to make use of us now passing the
"msg_id" to the user defined "error_func". We can now compare against
the FSCK_MSG_GITMODULES_MISSING instead of parsing the generated
message.

Let's also replace register_found_gitmodules() with directly
manipulating the "gitmodules_found" member. A recent commit moved it
into "fsck_options" so we could do this here.

I'm sticking this callback in fsck.c. Perhaps in the future we'd like
to accumulate such callbacks into another file (maybe fsck-cb.c,
similar to parse-options-cb.c?), but while we've got just the one
let's just put it into fsck.c.

A better alternative in this case would be some library some more
obvious library shared by fetch-pack.c ad builtin/index-pack.c, but
there isn't such a thing.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-28 19:03:10 -07:00
c96e184cae fetch-pack: use file-scope static struct for fsck_options
Change code added in 5476e1efde (fetch-pack: print and use dangling
.gitmodules, 2021-02-22) so that we use a file-scoped "static struct
fsck_options" instead of defining one in the "fsck_gitmodules_oids()"
function.

We use this pattern in all of
builtin/{fsck,index-pack,mktag,unpack-objects}.c. It's odd to see
fetch-pack be the odd one out. One might think that we're using other
fsck_options structs in fetch-pack, or doing on fsck twice there, but
we're not.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-28 19:03:10 -07:00
462f5cae0f fetch-pack: don't needlessly copy fsck_options
Change the behavior of the .gitmodules validation added in
5476e1efde (fetch-pack: print and use dangling .gitmodules,
2021-02-22) so we're using one "fsck_options".

I found that code confusing to read. One might think that not setting
up the error_func earlier means that we're relying on the "error_func"
not being set in some code in between the two hunks being modified
here.

But we're not, all we're doing in the rest of "cmd_index_pack()" is
further setup by calling fsck_set_msg_types(), and assigning to
do_fsck_object.

So there was no reason in 5476e1efde to make a shallow copy of the
fsck_options struct before setting error_func. Let's just do this
setup at the top of the function, along with the "walk" assignment.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-28 19:03:10 -07:00
c15087d17b fsck.c: move gitmodules_{found,done} into fsck_options
Move the gitmodules_{found,done} static variables added in
159e7b080b (fsck: detect gitmodules files, 2018-05-02) into the
fsck_options struct. It makes sense to keep all the context in the
same place.

This requires changing the recently added register_found_gitmodules()
function added in 5476e1efde (fetch-pack: print and use dangling
.gitmodules, 2021-02-22) to take fsck_options. That function will be
removed in a subsequent commit, but as it'll require the new
gitmodules_found attribute of "fsck_options" we need this intermediate
step first.

An earlier version of this patch removed the small amount of
duplication we now have between FSCK_OPTIONS_{DEFAULT,STRICT} with a
FSCK_OPTIONS_COMMON macro. I don't think such de-duplication is worth
it for this amount of copy/pasting.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-28 19:03:10 -07:00
53692df2b8 fsck.c: add an fsck_set_msg_type() API that takes enums
Change code I added in acf9de4c94 (mktag: use fsck instead of custom
verify_tag(), 2021-01-05) to make use of a new API function that takes
the fsck_msg_{id,type} types, instead of arbitrary strings that
we'll (hopefully) parse into those types.

At the time that the fsck_set_msg_type() API was introduced in
0282f4dced (fsck: offer a function to demote fsck errors to warnings,
2015-06-22) it was only intended to be used to parse user-supplied
data.

For things that are purely internal to the C code it makes sense to
have the compiler check these arguments, and to skip the sanity
checking of the data in fsck_set_msg_type() which is redundant to
checks we get from the compiler.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-28 19:03:10 -07:00
394d5d31b0 fsck.c: pass along the fsck_msg_id in the fsck_error callback
Change the fsck_error callback to also pass along the
fsck_msg_id. Before this change the only way to get the message id was
to parse it back out of the "message".

Let's pass it down explicitly for the benefit of callers that might
want to use it, as discussed in [1].

Passing the msg_type is now redundant, as you can always get it back
from the msg_id, but I'm not changing that convention. It's really
common to need the msg_type, and the report() function itself (which
calls "fsck_error") needs to call fsck_msg_type() to discover
it. Let's not needlessly re-do that work in the user callback.

1. https://lore.kernel.org/git/87blcja2ha.fsf@evledraar.gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-28 19:03:10 -07:00
44e07da8bb fsck.[ch]: move FOREACH_FSCK_MSG_ID & fsck_msg_id from *.c to *.h
Move the FOREACH_FSCK_MSG_ID macro and the fsck_msg_id enum it helps
define from fsck.c to fsck.h. This is in preparation for having
non-static functions take the fsck_msg_id as an argument.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-28 19:03:10 -07:00
901f2f6742 fsck.c: give "FOREACH_MSG_ID" a more specific name
Rename the FOREACH_MSG_ID macro to FOREACH_FSCK_MSG_ID in preparation
for moving it over to fsck.h. It's good convention to name macros
in *.h files in such a way as to clearly not clash with any other
names in other files.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-28 19:03:10 -07:00
b5495024ec fsck.c: undefine temporary STR macro after use
In f417eed8cd (fsck: provide a function to parse fsck message IDs,
2015-06-22) the "STR" macro was introduced, but that short macro name
was not undefined after use as was done earlier in the same series for
the MSG_ID macro in c99ba492f1 (fsck: introduce identifiers for fsck
messages, 2015-06-22).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-28 19:03:10 -07:00
c72da1a22b fsck.c: call parse_msg_type() early in fsck_set_msg_type()
There's no reason to defer the calling of parse_msg_type() until after
we've checked if the "id < 0". This is not a hot codepath, and
parse_msg_type() itself may die on invalid input.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-28 19:03:10 -07:00
30cf618eef fsck.h: re-order and re-assign "enum fsck_msg_type"
Change the values in the "enum fsck_msg_type" from being manually
assigned to using default C enum values.

This means we end up with a FSCK_IGNORE=0, which was previously
defined as "2".

I'm confident that nothing relies on these values, we always compare
them for equality. Let's not omit "0" so it won't be assumed that
we're using these as a boolean somewhere.

This also allows us to re-structure the fields to mark which are
"private" v.s. "public". See the preceding commit for a rationale for
not simply splitting these into two enums, namely that this is used
for both the private and public fsck API.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-28 19:03:10 -07:00
1b32b59f9b fsck.h: move FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} into an enum
Move the FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} defines into a new
fsck_msg_type enum.

These defines were originally introduced in:

 - ba002f3b28 (builtin-fsck: move common object checking code to
   fsck.c, 2008-02-25)
 - f50c440730 (fsck: disallow demoting grave fsck errors to warnings,
   2015-06-22)
 - efaba7cc77 (fsck: optionally ignore specific fsck issues
   completely, 2015-06-22)
 - f27d05b170 (fsck: allow upgrading fsck warnings to errors,
   2015-06-22)

The reason these were defined in two different places is because we
use FSCK_{IGNORE,INFO,FATAL} only in fsck.c, but FSCK_{ERROR,WARN} are
used by external callbacks.

Untangling that would take some more work, since we expose the new
"enum fsck_msg_type" to both. Similar to "enum object_type" it's not
worth structuring the API in such a way that only those who need
FSCK_{ERROR,WARN} pass around a different type.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-28 19:03:10 -07:00
e35d65a78a fsck.c: refactor fsck_msg_type() to limit scope of "int msg_type"
Refactor "if options->msg_type" and other code added in
0282f4dced (fsck: offer a function to demote fsck errors to warnings,
2015-06-22) to reduce the scope of the "int msg_type" variable.

This is in preparation for changing its type in a subsequent commit,
only using it in the "!options->msg_type" scope makes that change

This also brings the code in line with the fsck_set_msg_type()
function (also added in 0282f4dced), which does a similar check for
"!options->msg_type". Another minor benefit is getting rid of the
style violation of not having braces for the body of the "if".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-28 19:03:10 -07:00
35af754b06 fsck.c: rename remaining fsck_msg_id "id" to "msg_id"
Rename the remaining variables of type fsck_msg_id from "id" to
"msg_id". This change is relatively small, and is worth the churn for
a later change where we have different id's in the "report" function.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-28 19:03:10 -07:00
034a7b7bcc fsck.c: remove (mostly) redundant append_msg_id() function
Remove the append_msg_id() function in favor of calling
prepare_msg_ids(). We already have code to compute the camel-cased
msg_id strings in msg_id_info, let's use it.

When the append_msg_id() function was added in 71ab8fa840 (fsck:
report the ID of the error/warning, 2015-06-22) the prepare_msg_ids()
function didn't exist. When prepare_msg_ids() was added in
a46baac61e (fsck: factor out msg_id_info[] lazy initialization code,
2018-05-26) this code wasn't moved over to lazy initialization.

This changes the behavior of the code to initialize all the messages
instead of just camel-casing the one we need on the fly. Since the
common case is that we're printing just one message this is mostly
redundant work.

But that's OK in this case, reporting this fsck issue to the user
isn't performance-sensitive. If we were somehow doing so in a tight
loop (in a hopelessly broken repository?) this would help, since we'd
save ourselves from re-doing this work for identical messages, we
could just grab the prepared string from msg_id_info after the first
invocation.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-28 19:03:10 -07:00
f1abc2d0e1 fsck.c: rename variables in fsck_set_msg_type() for less confusion
Rename variables in a function added in 0282f4dced (fsck: offer a
function to demote fsck errors to warnings, 2015-06-22).

It was needlessly confusing that it took a "msg_type" argument, but
then later declared another "msg_type" of a different type.

Let's rename that to "severity", and rename "id" to "msg_id" and
"msg_id" to "msg_id_str" etc. This will make a follow-up change
smaller.

While I'm at it properly indent the fsck_set_msg_type() argument list.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-28 19:03:10 -07:00
a1aad71601 fsck.h: use "enum object_type" instead of "int"
Change the fsck_walk_func to use an "enum object_type" instead of an
"int" type. The types are compatible, and ever since this was added in
355885d531 (add generic, type aware object chain walker, 2008-02-25)
we've used entries from object_type (OBJ_BLOB etc.).

So this doesn't really change anything as far as the generated code is
concerned, it just gives the compiler more information and makes this
easier to read.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-28 19:03:10 -07:00
d385784f89 fsck.h: use designed initializers for FSCK_OPTIONS_{DEFAULT,STRICT}
Refactor the definitions of FSCK_OPTIONS_{DEFAULT,STRICT} to use
designated initializers. This allows us to omit those fields that
are initialized to 0 or NULL.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-28 19:02:59 -07:00
569f8d188f cmake(install): fix double .exe suffixes
By mistake, the `.exe` extension is appended _twice_ when installing the
dashed executables into `libexec/git-core/` on Windows (the extension is
already appended when adding items to the `git_links` list in the
`#Creating hardlinks` section).

Signed-off-by: Dennis Ameling <dennis@dennisameling.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-27 18:02:23 -07:00
7bb544a4d1 cmake: support SKIP_DASHED_BUILT_INS
Just like the Makefile-based build learned to skip hard-linking the
dashed built-ins in 179227d6e2 (Optionally skip linking/copying the
built-ins, 2020-09-21), this patch teaches the CMake-based build the
same trick.

Note: In contrast to the Makefile-based process, the built-ins would
only be linked during installation, not already when Git is built.
Therefore, the CMake-based build that we use in our CI builds _already_
does not link those built-ins (because the files are not installed
anywhere, they are used to run the test suite in-place).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-27 18:02:23 -07:00
09420b7648 Document how we do embargoed releases
Whenever we fix critical vulnerabilities, we follow some sort of
protocol (e.g. setting a coordinated release date, keeping the fix under
embargo until that time, coordinating with packagers and/or hosting
sites, etc).

Similar in spirit to `Documentation/howto/maintain-git.txt`, let's
formalize the details in a document.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-27 15:13:12 -07:00
2e99b1e383 SECURITY: describe how to report vulnerabilities
In the same document, describe that Git does not have Long Term Support
(LTS) release trains, although security fixes are always applied to a
few of the most recent release trains.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-27 15:13:02 -07:00
9a7f1ce8b7 daemon: sanitize all directory separators
When sanitizing client-supplied strings on Windows, also strip off
backslashes, not just slashes.

Signed-off-by: René Scharfe <l.s.r@web.de>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-26 22:00:12 -07:00
84d06cdc06 Sync with v2.31.1 2021-03-26 14:59:47 -07:00
26c4f98ffd The fourth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-26 14:59:03 -07:00
89519f662c Merge branch 'cm/rebase-i-fixup-amend-reword'
"git commit --fixup=<commit>", which was to tweak the changes made
to the contents while keeping the original log message intact,
learned "--fixup=(amend|reword):<commit>", that can be used to
tweak both the message and the contents, and only the message,
respectively.

* cm/rebase-i-fixup-amend-reword:
  doc/git-commit: add documentation for fixup=[amend|reword] options
  t3437: use --fixup with options to create amend! commit
  t7500: add tests for --fixup=[amend|reword] options
  commit: add a reword suboption to --fixup
  commit: add amend suboption to --fixup to create amend! commit
  sequencer: export and rename subject_length()
2021-03-26 14:59:03 -07:00
fde07fc356 Merge branch 'cm/rebase-i-updates'
Follow-up fixes to "cm/rebase-i" topic.

* cm/rebase-i-updates:
  doc/rebase -i: fix typo in the documentation of 'fixup' command
  t/t3437: fixup the test 'multiple fixup -c opens editor once'
  t/t3437: use named commits in the tests
  t/t3437: simplify and document the test helpers
  t/t3437: check the author date of fixed up commit
  t/t3437: remove the dependency of 'expected-message' file from tests
  t/t3437: fixup here-docs in the 'setup' test
  t/lib-rebase: update the documentation of FAKE_LINES
  rebase -i: clarify and fix 'fixup -c' rebase-todo help
  sequencer: rename a few functions
  sequencer: fixup the datatype of the 'flag' argument
2021-03-26 14:59:03 -07:00
ce4296cf2b Merge branch 'cm/rebase-i'
"rebase -i" is getting cleaned up and also enhanced.

* cm/rebase-i:
  doc/git-rebase: add documentation for fixup [-C|-c] options
  rebase -i: teach --autosquash to work with amend!
  t3437: test script for fixup [-C|-c] options in interactive rebase
  rebase -i: add fixup [-C | -c] command
  sequencer: use const variable for commit message comments
  sequencer: pass todo_item to do_pick_commit()
  rebase -i: comment out squash!/fixup! subjects from squash message
  sequencer: factor out code to append squash message
  rebase -i: only write fixup-message when it's needed
2021-03-26 14:59:03 -07:00
8c81fce4b0 Merge branch 'js/http-pki-credential-store'
The http codepath learned to let the credential layer to cache the
password used to unlock a certificate that has successfully been
used.

* js/http-pki-credential-store:
  http: drop the check for an empty proxy password before approving
  http: store credential when PKI auth is used
2021-03-26 14:59:02 -07:00
ed953e1076 Merge branch 'ab/make-cleanup'
Reorganize Makefile to allow building git.o and other essential
objects without extra stuff needed only for testing.

* ab/make-cleanup:
  Makefile: add {program,xdiff,test,git,fuzz}-objs & objects targets
  Makefile: split OBJECTS into OBJECTS and GIT_OBJS
  Makefile: sort OBJECTS assignment for subsequent change
  Makefile: split up long OBJECTS line
  Makefile: guard against TEST_OBJS in the environment
2021-03-26 14:59:02 -07:00
48bf2fa8ba Git 2.31.1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-26 14:49:41 -07:00
ddaf1f62e3 csum-file: make hashwrite() more readable
The hashwrite() method takes an input buffer and updates a hashfile's
hash function while writing the data to a file. To avoid overuse of
flushes, the hashfile has an internal buffer and most writes will use
memcpy() to transfer data from the input 'buf' to the hashfile's buffer
of size 8 * 1024 bytes.

Logic introduced by a8032d12 (sha1write: don't copy full sized buffers,
2008-09-02) reduces the number of memcpy() calls when the input buffer
is sufficiently longer than the hashfile's buffer, causing nr to be the
length of the full buffer. In these cases, the input buffer is used
directly in chunks equal to the hashfile's buffer size.

This method caught my attention while investigating some performance
issues, but it turns out that these performance issues were noise within
the variance of the experiment.

However, during this investigation, I inspected hashwrite() and
misunderstood it, even after looking closely and trying to make it
faster. This change simply reorganizes some parts of the loop within
hashwrite() to make it clear that each batch either uses memcpy() to the
hashfile's buffer or writes directly from the input buffer. The previous
code relied on indirection through local variables and essentially
inlined the implementation of hashflush() to reduce lines of code.

Helped-by: Jeff King <peff@peff.net>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-26 14:32:45 -07:00
9198c13e34 The third patch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-24 14:36:27 -07:00
858119f6d7 Merge branch 'nk/diff-index-fsmonitor'
"git diff-index" codepath has been taught to trust fsmonitor status
to reduce number of lstat() calls.

* nk/diff-index-fsmonitor:
  fsmonitor: add perf test for git diff HEAD
  fsmonitor: add assertion that fsmonitor is valid to check_removed
  fsmonitor: skip lstat deletion check during git diff-index
2021-03-24 14:36:27 -07:00
e537784f64 Merge branch 'jk/fail-prereq-testfix'
GIT_TEST_FAIL_PREREQS is a mechanism to skip test pieces with
prerequisites to catch broken tests that depend on the side effects
of optional pieces, but did not work at all when negative
prerequisites were involved.

* jk/fail-prereq-testfix:
  t: annotate !PTHREADS tests with !FAIL_PREREQS
2021-03-24 14:36:27 -07:00
2744383cbd Merge branch 'tb/geometric-repack'
"git repack" so far has been only capable of repacking everything
under the sun into a single pack (or split by size).  A cleverer
strategy to reduce the cost of repacking a repository has been
introduced.

* tb/geometric-repack:
  builtin/pack-objects.c: ignore missing links with --stdin-packs
  builtin/repack.c: reword comment around pack-objects flags
  builtin/repack.c: be more conservative with unsigned overflows
  builtin/repack.c: assign pack split later
  t7703: test --geometric repack with loose objects
  builtin/repack.c: do not repack single packs with --geometric
  builtin/repack.c: add '--geometric' option
  packfile: add kept-pack cache for find_kept_pack_entry()
  builtin/pack-objects.c: rewrite honor-pack-keep logic
  p5303: measure time to repack with keep
  p5303: add missing &&-chains
  builtin/pack-objects.c: add '--stdin-packs' option
  revision: learn '--no-kept-objects'
  packfile: introduce 'find_kept_pack_entry()'
2021-03-24 14:36:27 -07:00
c6617d1e4f Merge branch 'tb/push-simple-uses-branch-merge-config'
Doc update.

* tb/push-simple-uses-branch-merge-config:
  Documentation/git-push.txt: correct configuration typo
2021-03-24 14:36:27 -07:00
bf12013f1a pack-objects: fix comment of reused_chunk.difference
As record_reused_object(offset, offset - hashfile_total(out)) said,
reused_chunk.difference should be the offset of original packfile minus
the offset of the generated packfile. But the comment presented an opposite way.

Signed-off-by: Han Xin <hanxin.hx@alibaba-inc.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-24 13:03:22 -07:00
28e29ee38b format-patch: give an overview of what a "patch" message is
The text says something called a "patch" is prepared one for each
commit, it is suitable for e-mail submission, and "am" is the
command to use it, but does not say what the "patch" really is.

The description in the page also refers to the "three-dash" line,
but it is unclear what it is, unless the reader is given a more
detailed overview of what the "patch" is.

Add a brief paragraph to give an overview of what the output looks
like.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-24 12:14:23 -07:00
6131807864 git-completion.bash: use __gitcomp_builtin() in _git_stash()
The completion for 'git stash' has not changed in a major way since it
was converted from shell script to builtin. Now that it's a builtin, we
can take advantage of the groundwork laid out by parse-options and use
the generated options.

Rewrite _git_stash() to take use __gitcomp_builtin() to generate
completions for subcommands.

The main `git stash` command does not take any arguments directly. If no
subcommand is given, it automatically defaults to `git stash push`. This
means that we can simplify the logic for when no subcommands have been
given yet. We only have to offer subcommand completions when we're
completing a non-option after "stash".

One area that this patch could improve upon is that the `git stash list`
command accepts log-options. It would be nice if the completion for this
were unified with that of _git_log() and _git_show() which would allow
completions to be provided for options such as `--pretty` but that is
outside the scope of this patch.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-24 10:05:47 -07:00
42b30bcbb7 git-completion.bash: extract from else in _git_stash()
To save a level of indentation, perform an early return in the "if" arm
so we can move the "else" code out of the block.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-24 10:05:47 -07:00
e94fb44042 git-completion.bash: pass $__git_subcommand_idx from __git_main()
Many completion functions perform hardcoded comparisons with $cword.
This fails in the case where the main git command is given arguments
(e.g. `git -C . bundle<TAB>` would fail to complete its subcommands).

Even _git_worktree(), which uses __git_find_on_cmdline(), could still
fail. With something like `git -C add worktree move<TAB>`, the
subcommand would be incorrectly identified as "add" instead of "move".

Assign $__git_subcommand_idx in __git_main(), where the git subcommand
is actually found and the corresponding completion function is called.
Use this variable to replace hardcoded comparisons with $cword.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-24 10:05:47 -07:00
76593c09bb mktag tests: fix broken "&&" chain
Remove a stray "xb" I inadvertently introduced in 780aa0a21e (tests:
remove last uses of GIT_TEST_GETTEXT_POISON=false, 2021-02-11).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-23 22:14:28 -07:00
c8243933c7 git-send-email: Respect core.hooksPath setting
get-send-email currently makes the assumption that the
'sendemail-validate' hook exists inside of the repository.

Since the introduction of 'core.hooksPath' configuration option in
867ad08a26 (hooks: allow customizing where the hook directory is,
2016-05-04), this is no longer true.

Instead of assuming a hardcoded repo relative path, query
git for the actual path of the hooks directory.

Signed-off-by: Robert Foss <robert.foss@linaro.org>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-23 15:02:52 -07:00
9bcde4d531 rebase: remove transitory rebase.useBuiltin setting & env
Remove the rebase.useBuiltin setting and the now-obsolete
GIT_TEST_REBASE_USE_BUILTIN test flag.

This was left in place after my d03ebd411c (rebase: remove the
rebase.useBuiltin setting, 2019-03-18) to help anyone who'd used the
experimental flag and wanted to know that it was the default, or that
they should transition their test environment to use the builtin
rebase unconditionally.

It's been more than long enough for those users to get a headsup about
this. So remove all the scaffolding that was left inplace after
d03ebd411c. I'm also removing the documentation entry, if anyone
still has this left in their configuration they can do some source
archaeology to figure out what it used to do, which makes more sense
than exposing every git user reading the documentation to this legacy
configuration switch.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-23 14:05:58 -07:00
db91988aa1 format-patch: allow a non-integral version numbers
The `-v<n>` option of `format-patch` can give nothing but an
integral iteration number to patches in a series.  Some people,
however, prefer to mark a new iteration with only a small fixup
with a non integral iteration number (e.g. an "oops, that was
wrong" fix-up patch for v4 iteration may be labeled as "v4.1").

Allow `format-patch` to take such a non-integral iteration
number.

`<n>` can be any string, such as '3.1' or '4rev2'. In the case
where it is a non-integral value, the "Range-diff" and "Interdiff"
headers will not include the previous version.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-23 12:49:47 -07:00
ae22751f9b entry: add checkout_entry_ca() taking preloaded conv_attrs
The parallel checkout machinery will call checkout_entry() for entries
that could not be written in parallel due to path collisions. At this
point, we will already be holding the conversion attributes for each
entry, and it would be wasteful to let checkout_entry() load these
again. Instead, let's add the checkout_entry_ca() variant, which
optionally takes a preloaded conv_attrs struct.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-23 10:34:05 -07:00
30419e7e1d entry: move conv_attrs lookup up to checkout_entry()
In a following patch, checkout_entry() will use conv_attrs to decide
whether an entry should be enqueued for parallel checkout or not. But
the attributes lookup only happens lower in this call stack. To avoid
the unnecessary work of loading the attributes twice, let's move it up
to checkout_entry(), and pass the loaded struct down to write_entry().

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-23 10:34:05 -07:00
584a0d13f2 entry: extract update_ce_after_write() from write_entry()
The code that updates the in-memory index information after an entry is
written currently resides in write_entry(). Extract it to a public
function so that it can be called by the parallel checkout functions,
outside entry.c, in a later patch.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-23 10:34:05 -07:00
49cfd9032a entry: make fstat_output() and read_blob_entry() public
These two functions will be used by the parallel checkout code, so let's
make them public. Note: fstat_output() is renamed to
fstat_checkout_output(), now that it has become public, seeking to avoid
future name collisions.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-23 10:34:05 -07:00
d052cc0382 entry: extract a header file for entry.c functions
The declarations of entry.c's public functions and structures currently
reside in cache.h. Although not many, they contribute to the size of
cache.h and, when changed, cause the unnecessary recompilation of
modules that don't really use these functions. So let's move them to a
new entry.h header. While at it let's also move a comment related to
checkout_entry() from entry.c to entry.h as it's more useful to describe
the function there.

Original-patch-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-23 10:34:05 -07:00
2daae3d1d1 commit: add --trailer option
Historically, Git has supported the 'Signed-off-by' commit trailer
using the '--signoff' and the '-s' option from the command line.
But users may need to provide other trailer information from the
command line such as "Helped-by", "Reported-by", "Mentored-by",

Now implement a new `--trailer <token>[(=|:)<value>]` option to pass
other trailers to `interpret-trailers` and insert them into commit
messages.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-23 10:31:38 -07:00
1424303384 The second batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-22 14:00:25 -07:00
3099d4faa3 Merge branch 'bc/clone-bare-with-conflicting-config'
"git -c core.bare=false clone --bare ..." would have segfaulted,
which has been corrected.

* bc/clone-bare-with-conflicting-config:
  builtin/init-db: handle bare clones when core.bare set to false
2021-03-22 14:00:25 -07:00
d4bda9b045 Merge branch 'jk/filter-branch-sha256'
Code clean-up.

* jk/filter-branch-sha256:
  filter-branch: drop $_x40 glob
  filter-branch: drop multiple-ancestor warning
  t7003: test ref rewriting explicitly
2021-03-22 14:00:25 -07:00
20adca9006 Merge branch 'ps/update-ref-trans-hook-doc'
Doc update.

* ps/update-ref-trans-hook-doc:
  githooks.txt: clarify documentation on reference-transaction hook
  githooks.txt: replace mentions of SHA-1 specific properties
2021-03-22 14:00:25 -07:00
960f466d1a Merge branch 'rr/mailmap-entry-self'
* rr/mailmap-entry-self:
  Add entry for Ramkumar Ramachandra
2021-03-22 14:00:25 -07:00
3d92c0a784 Merge branch 'jr/doc-ignore-typofix'
Doc cleanup.

* jr/doc-ignore-typofix:
  doc: .gitignore documentation typofix
2021-03-22 14:00:25 -07:00
44e03bfdb6 Merge branch 'sv/t9801-test-path-is-file-cleanup'
Test cleanup.

* sv/t9801-test-path-is-file-cleanup:
  t9801: replace test -f with test_path_is_file
2021-03-22 14:00:24 -07:00
c83d602ad2 Merge branch 'dl/cat-file-doc-cleanup'
Doc cleanup.

* dl/cat-file-doc-cleanup:
  git-cat-file.txt: remove references to "sha1"
  git-cat-file.txt: monospace args, placeholders and filenames
2021-03-22 14:00:24 -07:00
25f9326561 Merge branch 'rs/pretty-describe'
"git log --format='...'" learned "%(describe)" placeholder.

* rs/pretty-describe:
  archive: expand only a single %(describe) per archive
  pretty: document multiple %(describe) being inconsistent
  t4205: assert %(describe) test coverage
  pretty: add merge and exclude options to %(describe)
  pretty: add %(describe)
2021-03-22 14:00:24 -07:00
f5c73f69fd Merge branch 'dl/stash-show-untracked'
"git stash show" learned to optionally show untracked part of the
stash.

* dl/stash-show-untracked:
  stash show: learn stash.showIncludeUntracked
  stash show: teach --include-untracked and --only-untracked
2021-03-22 14:00:24 -07:00
dd4048d1c7 Merge branch 'en/ort-perf-batch-8'
Rename detection rework continues.

* en/ort-perf-batch-8:
  diffcore-rename: compute dir_rename_guess from dir_rename_counts
  diffcore-rename: limit dir_rename_counts computation to relevant dirs
  diffcore-rename: compute dir_rename_counts in stages
  diffcore-rename: extend cleanup_dir_rename_info()
  diffcore-rename: move dir_rename_counts into dir_rename_info struct
  diffcore-rename: add function for clearing dir_rename_count
  Move computation of dir_rename_count from merge-ort to diffcore-rename
  diffcore-rename: add a mapping of destination names to their indices
  diffcore-rename: provide basic implementation of idx_possible_rename()
  diffcore-rename: use directory rename guided basename comparisons
2021-03-22 14:00:24 -07:00
24119d9d7b Merge branch 'ab/grep-pcre2-allocfix'
Updates to memory allocation code around the use of pcre2 library.

* ab/grep-pcre2-allocfix:
  grep/pcre2: move definitions of pcre2_{malloc,free}
  grep/pcre2: move back to thread-only PCREv2 structures
  grep/pcre2: actually make pcre2 use custom allocator
  grep/pcre2: use pcre2_maketables_free() function
  grep/pcre2: use compile-time PCREv2 version test
  grep/pcre2: add GREP_PCRE2_DEBUG_MALLOC debug mode
  grep/pcre2: prepare to add debugging to pcre2_malloc()
  grep/pcre2: correct reference to grep_init() in comment
  grep/pcre2: drop needless assignment to NULL
  grep/pcre2: drop needless assignment + assert() on opt->pcre2
2021-03-22 14:00:23 -07:00
e8d5a423ca Merge branch 'jk/perf-in-worktrees'
Perf test update to work better in secondary worktrees.

* jk/perf-in-worktrees:
  t/perf: avoid copying worktree files from test repo
  t/perf: handle worktrees as test repos
2021-03-22 14:00:23 -07:00
d20fa3cf9d Merge branch 'ds/commit-graph-generation-config'
A new configuration variable has been introduced to allow choosing
which version of the generation number gets used in the
commit-graph file.

* ds/commit-graph-generation-config:
  commit-graph: use config to specify generation type
  commit-graph: create local repository pointer
2021-03-22 14:00:23 -07:00
52182e3b1f Merge branch 'ab/remote-write-config-in-camel-case'
Update C code that sets a few configuration variables when a remote
is configured so that it spells configuration variable names in the
canonical camelCase.

* ab/remote-write-config-in-camel-case:
  remote: write camel-cased *.pushRemote on rename
  remote: add camel-cased *.tagOpt key, like clone
2021-03-22 14:00:23 -07:00
2435feaa20 Merge branch 'mt/cleanly-die-upon-missing-required-filter'
We had a code to diagnose and die cleanly when a required
clean/smudge filter is missing, but an assert before that
unnecessarily fired, hiding the end-user facing die() message.

* mt/cleanly-die-upon-missing-required-filter:
  convert: fail gracefully upon missing clean cmd on required filter
2021-03-22 14:00:22 -07:00
204333b015 Merge branch 'jk/open-dotgitx-with-nofollow'
It does not make sense to make ".gitattributes", ".gitignore" and
".mailmap" symlinks, as they are supposed to be usable from the
object store (think: bare repositories where HEAD:.mailmap etc. are
used).  When these files are symbolic links, we used to read the
contents of the files pointed by them by mistake, which has been
corrected.

* jk/open-dotgitx-with-nofollow:
  mailmap: do not respect symlinks for in-tree .mailmap
  exclude: do not respect symlinks for in-tree .gitignore
  attr: do not respect symlinks for in-tree .gitattributes
  exclude: add flags parameter to add_patterns()
  attr: convert "macro_ok" into a flags field
  add open_nofollow() helper
2021-03-22 14:00:22 -07:00
2be927f3d1 diff --no-index tests: test mode normalization
When "git diff --no-index X Y" is run the modes of the files being
differ are normalized by canon_mode() in fill_filespec().

I recently broke that behavior in a patch of mine[1] which would pass
all tests, or not, depending on the umask of the git.git checkout.

Let's test for this explicitly. Arguably this should not be the
behavior of "git diff --no-index". We aren't diffing our own objects
or the index, so it might be useful to show mode differences between
files.

On the other hand diff(1) does not do that, and it would be needlessly
distracting when e.g. diffing an extracted tar archive whose contents
is the same, but whose file modes are different.

1. https://lore.kernel.org/git/20210316155829.31242-2-avarab@gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-22 12:22:26 -07:00
540cdc11ad pack-bitmap: avoid traversal of objects referenced by uninteresting tag
When preparing the bitmap walk, we first establish the set of of have
and want objects by iterating over the set of pending objects: if an
object is marked as uninteresting, it's declared as an object we already
have, otherwise as an object we want. These two sets are then used to
compute which transitively referenced objects we need to obtain.

One special case here are tag objects: when a tag is requested, we
resolve it to its first not-tag object and add both resolved objects as
well as the tag itself into either the have or want set. Given that the
uninteresting-property always propagates to referenced objects, it is
clear that if the tag is uninteresting, so are its children and vice
versa. But we fail to propagate the flag, which effectively means that
referenced objects will always be interesting except for the case where
they have already been marked as uninteresting explicitly.

This mislabeling does not impact correctness: we now have it in our
"wants" set, and given that we later do an `AND NOT` of the bitmaps of
"wants" and "haves" sets it is clear that the result must be the same.
But we now start to needlessly traverse the tag's referenced objects in
case it is uninteresting, even though we know that each referenced
object will be uninteresting anyway. In the worst case, this can lead to
a complete graph walk just to establish that we do not care for any
object.

Fix the issue by propagating the `UNINTERESTING` flag to pointees of tag
objects and add a benchmark with negative revisions to p5310. This shows
some nice performance benefits, tested with linux.git:

Test                                                          HEAD~                  HEAD
---------------------------------------------------------------------------------------------------------------
5310.3: repack to disk                                        193.18(181.46+16.42)   194.61(183.41+15.83) +0.7%
5310.4: simulated clone                                       25.93(24.88+1.05)      25.81(24.73+1.08) -0.5%
5310.5: simulated fetch                                       2.64(5.30+0.69)        2.59(5.16+0.65) -1.9%
5310.6: pack to file (bitmap)                                 58.75(57.56+6.30)      58.29(57.61+5.73) -0.8%
5310.7: rev-list (commits)                                    1.45(1.18+0.26)        1.46(1.22+0.24) +0.7%
5310.8: rev-list (objects)                                    15.35(14.22+1.13)      15.30(14.23+1.07) -0.3%
5310.9: rev-list with tag negated via --not --all (objects)   22.49(20.93+1.56)      0.11(0.09+0.01) -99.5%
5310.10: rev-list with negative tag (objects)                 0.61(0.44+0.16)        0.51(0.35+0.16) -16.4%
5310.11: rev-list count with blob:none                        12.15(11.19+0.96)      12.18(11.19+0.99) +0.2%
5310.12: rev-list count with blob:limit=1k                    17.77(15.71+2.06)      17.75(15.63+2.12) -0.1%
5310.13: rev-list count with tree:0                           1.69(1.31+0.38)        1.68(1.28+0.39) -0.6%
5310.14: simulated partial clone                              20.14(19.15+0.98)      19.98(18.93+1.05) -0.8%
5310.16: clone (partial bitmap)                               12.78(13.89+1.07)      12.72(13.99+1.01) -0.5%
5310.17: pack to file (partial bitmap)                        42.07(45.44+2.72)      41.44(44.66+2.80) -1.5%
5310.18: rev-list with tree filter (partial bitmap)           0.44(0.29+0.15)        0.46(0.32+0.14) +4.5%

While most benchmarks are probably in the range of noise, the newly
added 5310.9 and 5310.10 benchmarks consistenly perform better.

Signed-off-by: Patrick Steinhardt <ps@pks.im>.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-22 12:10:56 -07:00
1b0d9545bb remote-curl: fall back to basic auth if Negotiate fails
When the username and password are supplied in a url like this
https://myuser:secret@git.exampe/myrepo.git and the server supports the
negotiate authenticaten method, git does not fall back to basic auth and
libcurl hardly tries to authenticate with the negotiate method.

Stop using the Negotiate authentication method after the first failure
because if it fails on the first try it will never succeed.

Signed-off-by: Christopher Schenk <christopher@cschenk.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-22 11:55:41 -07:00
36a7eb6876 t0052: add simple-ipc tests and t/helper/test-simple-ipc tool
Create t0052-simple-ipc.sh with unit tests for the "simple-ipc" mechanism.

Create t/helper/test-simple-ipc test tool to exercise the "simple-ipc"
functions.

When the tool is invoked with "run-daemon", it runs a server to listen
for "simple-ipc" connections on a test socket or named pipe and
responds to a set of commands to exercise/stress the communication
setup.

When the tool is invoked with "start-daemon", it spawns a "run-daemon"
command in the background and waits for the server to become ready
before exiting.  (This helps make unit tests in t0052 more predictable
and avoids the need for arbitrary sleeps in the test script.)

The tool also has a series of client "send" commands to send commands
and data to a server instance.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-22 11:52:54 -07:00
7cd5dbcaba simple-ipc: add Unix domain socket implementation
Create Unix domain socket based implementation of "simple-ipc".

A set of `ipc_client` routines implement a client library to connect
to an `ipc_server` over a Unix domain socket, send a simple request,
and receive a single response.  Clients use blocking IO on the socket.

A set of `ipc_server` routines implement a thread pool to listen for
and concurrently service client connections.

The server creates a new Unix domain socket at a known location.  If a
socket already exists with that name, the server tries to determine if
another server is already listening on the socket or if the socket is
dead.  If socket is busy, the server exits with an error rather than
stealing the socket.  If the socket is dead, the server creates a new
one and starts up.

If while running, the server detects that its socket has been stolen
by another server, it automatically exits.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-22 11:52:54 -07:00
271cb303a5 diff --no-index tests: add test for --exit-code
Add a test for --exit-code working with --no-index. There's no reason
to suppose it wouldn't, but we weren't testing for it anywhere in our
tests. Let's fix that blind spot.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-22 11:48:41 -07:00
68ffe095a2 transport: also free remote_refs in transport_disconnect()
transport_get_remote_refs() can populate the transport struct's
remote_refs. transport_disconnect() is already responsible for most of
transport's cleanup - therefore we also take care of freeing remote_refs
there.

There are 2 locations where transport_disconnect() is called before
we're done using the returned remote_refs. This patch changes those
callsites to only call transport_disconnect() after the returned refs
are no longer being used - which is necessary to safely be able to
free remote_refs during transport_disconnect().

This commit fixes the following leak which was found while running
t0000, but is expected to also fix the same pattern of leak in all
locations that use transport_get_remote_refs():

Direct leak of 165 byte(s) in 1 object(s) allocated from:
    #0 0x49a6b2 in calloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:154:3
    #1 0x9a72f2 in xcalloc /home/ahunt/oss-fuzz/git/wrapper.c:140:8
    #2 0x8ce203 in alloc_ref_with_prefix /home/ahunt/oss-fuzz/git/remote.c:867:20
    #3 0x8ce1a2 in alloc_ref /home/ahunt/oss-fuzz/git/remote.c:875:9
    #4 0x72f63e in process_ref_v2 /home/ahunt/oss-fuzz/git/connect.c:426:8
    #5 0x72f21a in get_remote_refs /home/ahunt/oss-fuzz/git/connect.c:525:8
    #6 0x979ab7 in handshake /home/ahunt/oss-fuzz/git/transport.c:305:4
    #7 0x97872d in get_refs_via_connect /home/ahunt/oss-fuzz/git/transport.c:339:9
    #8 0x9774b5 in transport_get_remote_refs /home/ahunt/oss-fuzz/git/transport.c:1388:4
    #9 0x51cf80 in cmd_clone /home/ahunt/oss-fuzz/git/builtin/clone.c:1271:9
    #10 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11
    #11 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3
    #12 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4
    #13 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19
    #14 0x69c45e in main /home/ahunt/oss-fuzz/git/common-main.c:52:11
    #15 0x7f6a459d5349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-21 14:39:10 -07:00
64cc539fd2 parse-options: don't leak alias help messages
preprocess_options() allocates new strings for help messages for
OPTION_ALIAS. Therefore we also need to clean those help messages up
when freeing the returned options.

First introduced in:
  7c280589cf (parse-options: teach "git cmd -h" to show alias as alias, 2020-03-16)

The preprocessed options themselves no longer contain any indication
that a given option is/was an alias - therefore we add a new flag to
indicate former aliases. (An alternative approach would be to look back
at the original options to determine which options are aliases - but
that seems like a fragile approach. Or we could even look at the
alias_groups list - which might be less fragile, but would be slower
as it requires nested looping.)

As far as I can tell, parse_options() is only ever used once per
command, and the help messages are small - hence this leak has very
little impact.

This leak was found while running t0001. LSAN output can be found below:

Direct leak of 65 byte(s) in 1 object(s) allocated from:
    #0 0x49a859 in realloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
    #1 0x9aae36 in xrealloc /home/ahunt/oss-fuzz/git/wrapper.c:126:8
    #2 0x939d8d in strbuf_grow /home/ahunt/oss-fuzz/git/strbuf.c:98:2
    #3 0x93b936 in strbuf_vaddf /home/ahunt/oss-fuzz/git/strbuf.c:392:3
    #4 0x93b7ff in strbuf_addf /home/ahunt/oss-fuzz/git/strbuf.c:333:2
    #5 0x86747e in preprocess_options /home/ahunt/oss-fuzz/git/parse-options.c:666:3
    #6 0x866ed2 in parse_options /home/ahunt/oss-fuzz/git/parse-options.c:847:17
    #7 0x51c4a7 in cmd_clone /home/ahunt/oss-fuzz/git/builtin/clone.c:989:9
    #8 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11
    #9 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3
    #10 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4
    #11 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19
    #12 0x69c9fe in main /home/ahunt/oss-fuzz/git/common-main.c:52:11
    #13 0x7fdac42d4349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-21 14:39:10 -07:00
0171dbcb42 parse-options: convert bitfield values to use binary shift
Because it's easier to read, but also likely to be easier to maintain.
I am making this change because I need to add a new flag in a later
commit.

Also add a trailing comma to the last enum entry to simplify addition of
new flags.

This change was originally suggested by Peff in:
https://public-inbox.org/git/YEZ%2FBWWbpfVwl6nO@coredump.intra.peff.net/

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-21 14:39:10 -07:00
47957485b3 tree.h API: simplify read_tree_recursive() signature
Simplify the signature of read_tree_recursive() to omit the "base",
"baselen" and "stage" arguments. No callers of it use these parameters
for anything anymore.

The last function to call read_tree_recursive() with a non-"" path was
read_tree_recursive() itself, but that was changed in
ffd31f661d (Reimplement read_tree_recursive() using
tree_entry_interesting(), 2011-03-25).

The last user of the "stage" parameter went away in the last commit,
and even that use was mere boilerplate.

So let's remove those and rename the read_tree_recursive() function to
just read_tree(). We had another read_tree() function that I've
refactored away in preceding commits, since all in-tree users read
trees recursively with a callback we can change the name to signify
that this is the norm.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-20 16:09:26 -07:00
6c9fc42e9f tree.h API: expose read_tree_1() as read_tree_at()
Rename the static read_tree_1() function to read_tree_at(). This
function works just like read_tree_recursive(), except you provide
your own strbuf.

This step doesn't make much sense now, but in follow-up commits I'll
remove the base/baselen/stage arguments to read_tree_recursive(). At
that point an anticipated in-tree user[1] for the old
read_tree_recursive() couldn't provide a path to start the
traversal.

Let's give them a function to do so with an API that makes more sense
for them, by taking a strbuf we should be able to avoid more casting
and/or reallocations in the future.

1. https://lore.kernel.org/git/xmqqft106sok.fsf@gitster.g

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-20 16:09:26 -07:00
7367d88261 archive: stop passing "stage" through read_tree_recursive()
The "stage" variable being passed around in the archive code has only
ever been an elaborate way to hardcode the value "0".

This code was added in its original form in e4fbbfe9ec (Add
git-zip-tree, 2006-08-26), at which point a hardcoded "0" would be
passed down through read_tree_recursive() to write_zip_entry().

It was then diligently added to the "struct directory" in
ed22b4173b (archive: support filtering paths with glob, 2014-09-21),
but we were still not doing anything except passing it around as-is.

Let's stop doing that in the code internal to archive.c, we'll still
feed "0" to read_tree_recursive() itself, but won't use it. That we're
providing it at all to read_tree_recursive() will be changed in a
follow-up commit.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-20 16:09:26 -07:00
9614ad3ce0 ls-files: refactor away read_tree()
Refactor away the read_tree() function into its only user,
overlay_tree_on_index().

First, change read_one_entry_opt() to use the strbuf parameter
read_tree_recursive() passes down in place. This finishes up a partial
refactoring started in 6a0b0b6de9 (tree.c: update read_tree_recursive
callback to pass strbuf as base, 2014-11-30).

Moving the rest into overlay_tree_on_index() makes this index juggling
we're doing easier to read.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-20 16:09:26 -07:00
fcc7c12f11 ls-files: don't needlessly pass around stage variable
Now that read_tree() has been moved to ls-files.c we can get rid of
the stage != 1 case that'll never happen.

Let's not use read_tree_recursive() as a pass-through to pass "stage =
1" either. For now we'll pass an unused "stage = 0" for consistency
with other read_tree_recursive() callers, that argument will be
removed in a follow-up commit.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-20 16:09:26 -07:00
eefadd18e1 tree.c API: move read_tree() into builtin/ls-files.c
Since the read_tree() API was added around the same time as
read_tree_recursive() in 94537c78a8 (Move "read_tree()" to
"tree.c"[...], 2005-04-22) and b12ec373b8 ([PATCH] Teach read-tree
about commit objects, 2005-04-20) things have gradually migrated over
to the read_tree_recursive() version.

Now builtin/ls-files.c is the last user of this code, let's move all
the relevant code there. This allows for subsequent simplification of
it, and an eventual move to read_tree_recursive().

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-20 16:09:25 -07:00
8de78218c5 ls-files tests: add meaningful --with-tree tests
Add tests for "ls-files --with-tree". There was effectively no
coverage for any normal usage of this command, only the tests added in
54e1abce90 (Add test case for ls-files --with-tree, 2007-10-03) for
an obscure bug.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-20 16:09:25 -07:00
dcc0a86f2f show tests: add test for "git show <tree>"
Add missing tests for showing a tree with "git show". Let's test for
showing a tree, two trees, and that doing so doesn't recurse.

The only tests for this code added in 5d7eeee2ac (git-show: grok
blobs, trees and tags, too, 2006-12-14) were the tests in
t7701-repack-unpack-unreachable.sh added in ccc1297226 (repack:
modify behavior of -A option to leave unreferenced objects unpacked,
2008-05-09).

Let's add this common mode of operation to the "show" tests
themselves. It's more obvious, and the tests in
t7701-repack-unpack-unreachable.sh happily pass if we start buggily
emitting trees recursively.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-20 16:09:25 -07:00
f3b964a07e Add testing with merge-ort merge strategy
In preparation for switching from merge-recursive to merge-ort as the
default strategy, have the testsuite default to running with merge-ort.
Keep coverage of the recursive backend by having the linux-gcc job run
with it.

Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-20 12:35:40 -07:00
259490e572 t6423: mark remaining expected failure under merge-ort as such
When we started on merge-ort, thousands of tests failed when run with
the GIT_TEST_MERGE_ALGORITHM=ort flag; with so many, it didn't make
sense to flip all their test expectations.  The ones in t6409, t6418,
and the submodule tests are being handled by an independent in-flight
topic ("Complete merge-ort implemenation...almost").  The ones in
t6423 were left out of the other series because other ongoing series
that this commit depends upon were addressing those.  Now that we only
have one remaining test failure in t6423, let's mark it as such.

This remaining test will be fixed by a future optimization series, but
since merge-recursive doesn't pass this test either, passing it is not
necessary for declaring merge-ort ready for general use.

Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-20 12:35:40 -07:00
41376b58e6 Revert "merge-ort: ignore the directory rename split conflict for now"
This reverts commit 5ced7c3da0, which was
put in place as a temporary measure to avoid optimizations unstably
erroring out on no destination having a majority of the necessary
renames for directories that had no new files and thus no need for
directory rename detection anyway.  Now that optimizations are in place
to prevent us from trying to compute directory rename count computations
for directories that do not need it, we can undo this temporary measure.

Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-20 12:35:40 -07:00
816147e7ba merge-recursive: add a bunch of FIXME comments documenting known bugs
The plan is to just delete merge-recursive, but not until everyone is
comfortable with merge-ort as a replacement.  Given that I haven't
switched all callers of merge-recursive over yet (e.g. git-am still uses
merge-recursive), maybe there's some value documenting known bugs in the
algorithm in case we end up keeping it or someone wants to dig it up in
the future.

Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-20 12:35:40 -07:00
5291828df8 merge-ort: write $GIT_DIR/AUTO_MERGE whenever we hit a conflict
There are a variety of questions users might ask while resolving
conflicts:
  * What changes have been made since the previous (first) parent?
  * What changes are staged?
  * What is still unstaged? (or what is still conflicted?)
  * What changes did I make to resolve conflicts so far?
The first three of these have simple answers:
  * git diff HEAD
  * git diff --cached
  * git diff
There was no way to answer the final question previously.  Adding one
is trivial in merge-ort, since it works by creating a tree representing
what should be written to the working copy complete with conflict
markers.  Simply write that tree to .git/AUTO_MERGE, allowing users to
answer the fourth question with
  * git diff AUTO_MERGE

I avoided using a name like "MERGE_AUTO", because that would be
merge-specific (much like MERGE_HEAD, REBASE_HEAD, REVERT_HEAD,
CHERRY_PICK_HEAD) and I wanted a name that didn't change depending on
which type of operation the merge was part of.

Ensure that paths which clean out other temporary operation-specific
files (e.g. CHERRY_PICK_HEAD, MERGE_MSG, rebase-merge/ state directory)
also clean out this AUTO_MERGE file.

Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-20 12:35:40 -07:00
aa2faac03a t: mark several submodule merging tests as fixed under merge-ort
merge-ort handles submodules (and directory/file conflicts in general)
differently than merge-recursive does; it basically puts all the special
handling for different filetypes into one place in the codebase instead
of needing special handling for different filetypes in many different
code paths.  This one code path in merge-ort could perhaps use some work
still (there are still test_expect_failure cases in the testsuite), but
it passes all the tests that merge-recursive does as well as 12
additional ones that merge-recursive fails.  Mark those 12 tests as
test_expect_success under merge-ort.

Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-20 12:35:40 -07:00
66b209b86a merge-ort: implement CE_SKIP_WORKTREE handling with conflicted entries
When merge conflicts occur in paths removed by a sparse-checkout, we
need to unsparsify those paths (clear the SKIP_WORKTREE bit), and write
out the conflicted file to the working copy.  In the very unlikely case
that someone manually put a file into the working copy at the location
of the SKIP_WORKTREE file, we need to avoid overwriting whatever edits
they have made and move that file to a different location first.

Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-20 12:35:40 -07:00
8ddc20b896 t6428: new test for SKIP_WORKTREE handling and conflicts
If there is a conflict during a merge for a SKIP_WORKTREE entry, we
expect that file to be written to the working copy and have the
SKIP_WORKTREE bit cleared in the index.  If the user had manually
created a file in the working tree despite SKIP_WORKTREE being set, we
do not want to clobber their changes to that file, but want to move it
out of the way.  Add tests that check for these behaviors.

Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-20 12:35:40 -07:00
3639dfb3a8 merge-ort: support subtree shifting
merge-recursive has some simple code to support subtree shifting; copy
it over to merge-ort.  This fixes t6409.12 under
GIT_TEST_MERGE_ALGORITHM=ort.

Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-20 12:35:40 -07:00
3860220bfa merge-ort: let renormalization change modify/delete into clean delete
When we have a modify/delete conflict, but the only change to the
modification is e.g. change of line endings, then if renormalization is
requested then we should be able to recognize such a case as a
not-modified/delete and resolve the conflict automatically.

This fixes t6418.10 under GIT_TEST_MERGE_ALGORITHM=ort.

Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-20 12:35:40 -07:00
1218b3ab86 merge-ort: have ll_merge() use a special attr_index for renormalization
ll_merge() needs an index when renormalization is requested.  Create one
specifically for just this purpose with just the one needed entry.  This
fixes t6418.4 and t6418.5 under GIT_TEST_MERGE_ALGORITHM=ort.

NOTE 1: Even if the user has a working copy or a real index (which is
not a given as merge-ort can be used in bare repositories), we
explicitly ignore any .gitattributes file from either of these
locations.  merge-ort can be used to merge two branches that are
unrelated to HEAD, so .gitattributes from the working copy and current
index should not be considered relevant.

NOTE 2: Since we are in the middle of merging, there is a risk that
.gitattributes itself is conflicted...leaving us with an ill-defined
situation about how to perform the rest of the merge.  It could be that
the .gitattributes file does not even exist on one of the sides of the
merge, or that it has been modified on both sides.  If it's been
modified on both sides, it's possible that it could itself be merged
cleanly, though it's also possible that it only merges cleanly if you
use the right version of the .gitattributes file to drive the merge.  It
gets kind of complicated.  The only test we ever had that attempted to
test behavior in this area was seemingly unaware of the undefined
behavior, but knew the test wouldn't work for lack of attribute handling
support, marked it as test_expect_failure from the beginning, but
managed to fail for several reasons unrelated to attribute handling.
See commit 6f6e7cfb52 ("t6038: remove problematic test", 2020-08-03) for
details.  So there are probably various ways to improve what
initialize_attr_index() picks in the case of a conflicted .gitattributes
but for now I just implemented something simple -- look for whatever
.gitattributes file we can find in any of the higher order stages and
use it.

Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-20 12:35:40 -07:00
ea305a68fd merge-ort: add a special minimal index just for renormalization
renormalize_buffer() requires an index_state, which is something that
merge-ort does not operate with.  However, all the renormalization code
needs is an index with a .gitattributes file...plus a little bit of
setup.  Create such an index, along with the deallocation and
attr_direction handling.

A subsequent commit will add a function to finish the initialization
of this index.

Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-20 12:35:40 -07:00
72b3091040 merge-ort: use STABLE_QSORT instead of QSORT where required
rename/rename conflict handling depends on the fact that if both sides
renamed the same path, that the one on the MERGE_SIDE1 will appear first
in the combined diff_queue_struct passed to process_renames().  Since we
add all pairs from MERGE_SIDE1 to combined first, and then all pairs
from MERGE_SIDE2, and then sort based on filename, this will only be
true if the sort used is stable.  This was found due to the fact that
Mac, unlike Linux, apparently has a system-defined qsort that is not
stable.

While we are at it, review the other callers of QSORT and add comments
about why they can remain as calls to QSORT instead of being modified
to call STABLE_QSORT.

Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-20 12:35:39 -07:00
ef486a9ecf Merge branch 'tb/git-mv-icase-fix'
Fix a corner case bug in "git mv" on case insensitive systems,
which was introduced in 2.29 timeframe.

* tb/git-mv-icase-fix:
  git mv foo FOO ; git mv foo bar gave an assert
2021-03-19 15:25:40 -07:00
98164e9585 The first batch in 2.32 cycle
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-19 15:25:40 -07:00
bfcc6e2a68 Merge branch 'rs/xcalloc-takes-nelem-first'
Code cleanup.

* rs/xcalloc-takes-nelem-first:
  fix xcalloc() argument order
2021-03-19 15:25:39 -07:00
af107029b1 Merge branch 'ah/make-fuzz-all-doc-update'
Update insn in Makefile comments to run fuzz-all target.

* ah/make-fuzz-all-doc-update:
  Makefile: update 'make fuzz-all' docs to reflect modern clang
2021-03-19 15:25:39 -07:00
c691e918f4 Merge branch 'jk/slimmed-down'
Unused code removal.

* jk/slimmed-down:
  vcs-svn: remove header files as well
2021-03-19 15:25:38 -07:00
92ccd7b752 Merge branch 'rs/calloc-array'
CALLOC_ARRAY() macro replaces many uses of xcalloc().

* rs/calloc-array:
  cocci: allow xcalloc(1, size)
  use CALLOC_ARRAY
  git-compat-util.h: drop trailing semicolon from macro definition
2021-03-19 15:25:38 -07:00
a8a0ac3234 Merge branch 'rs/avoid-null-statement-after-macro-call'
Fix macros that can silently inject unintended null-statements.

* rs/avoid-null-statement-after-macro-call:
  mem-pool: drop trailing semicolon from macro definition
  block-sha1: drop trailing semicolon from macro definition
2021-03-19 15:25:38 -07:00
948e8ac534 Merge branch 'km/config-doc-typofix'
Docfix.

* km/config-doc-typofix:
  config.txt: add missing period
2021-03-19 15:25:38 -07:00
cc930b7472 Merge branch 'jt/clone-unborn-head'
Test fix.

* jt/clone-unborn-head:
  t5606: run clone branch name test with protocol v2
2021-03-19 15:25:38 -07:00
1dd4e74522 Merge branch 'js/fsmonitor-unpack-fix'
The data structure used by fsmonitor interface was not properly
duplicated during an in-core merge, leading to use-after-free etc.

* js/fsmonitor-unpack-fix:
  fsmonitor: do not forget to release the token in `discard_index()`
  fsmonitor: fix memory corruption in some corner cases
2021-03-19 15:25:37 -07:00
35381b13da Merge branch 'jk/bisect-peel-tag-fix'
"git bisect" reimplemented more in C during 2.30 timeframe did not
take an annotated tag as a good/bad endpoint well.  This regression
has been corrected.

* jk/bisect-peel-tag-fix:
  bisect: peel annotated tags to commits
2021-03-19 15:25:37 -07:00
8779c141da Merge branch 'jh/fsmonitor-prework'
The fsmonitor interface read from its input without making sure
there is something to read from.  This bug is new in 2.31
timeframe.

* jh/fsmonitor-prework:
  fsmonitor: avoid global-buffer-overflow READ when checking trivial response
2021-03-19 15:25:37 -07:00
eabacfd9cb Merge branch 'jc/calloc-fix'
Code clean-up.

* jc/calloc-fix:
  xcalloc: use CALLOC_ARRAY() when applicable
2021-03-19 15:25:37 -07:00
14e7b8344f builtin/pack-objects.c: ignore missing links with --stdin-packs
When 'git pack-objects --stdin-packs' encounters a commit in a pack, it
marks it as a starting point of a best-effort reachability traversal
that is used to populate the name-hash of the objects listed in the
given packs.

The traversal expects that it should be able to walk the ancestors of
all commits in a pack without issue. Ordinarily this is the case, but it
is possible to having missing parents from an unreachable part of the
repository. In that case, we'd consider any missing objects in the
unreachable portion of the graph to be junk.

This should be handled gracefully: since the traversal is best-effort
(i.e., we don't strictly need to fill in all of the name-hash fields),
we should simply ignore any missing links.

This patch does that (by setting the 'ignore_missing_links' bit on the
rev_info struct), and ensures we don't regress in the future by adding a
test which demonstrates this case.

It is a little over-eager, since it will also ignore missing links in
reachable parts of the packs (which would indicate a corrupted
repository), but '--stdin-packs' is explicitly *not* about reachability.
So this step isn't making anything worse for a repository which contains
packs missing reachable objects (since we never drop objects with
'--stdin-packs').

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-19 11:19:29 -07:00
6534d436a2 INSTALL: note on using Asciidoctor to build doc
Note on using Asciidoctor to build documentation suite.

Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-19 10:49:20 -07:00
9bd342137e diffcore-rename: determine which relevant_sources are no longer relevant
As noted a few commits ago ("diffcore-rename: only compute
dir_rename_count for relevant directories"), when a source file rename
is used as part of directory rename detection, we need to increment
counts for each ancestor directory in dirs_removed with value
RELEVANT_FOR_SELF.  However, a few commits ago ("diffcore-rename: check
if we have enough renames for directories early on"), we may have
downgraded all relevant ancestor directories from RELEVANT_FOR_SELF to
RELEVANT_FOR_ANCESTOR.

For a given file, if no ancestor directory is found in dirs_removed with
a value of RELEVANT_FOR_SELF, then we can downgrade
relevant_source[PATH] from RELEVANT_LOCATION to RELEVANT_NO_MORE.  This
means we can skip detecting a rename for that particular path (and any
other paths in the same directory).

For the testcases mentioned in commit 557ac0350d ("merge-ort: begin
performance work; instrument with trace2_region_* calls", 2020-10-28),
this change improves the performance as follows:

                            Before                  After
    no-renames:        5.680 s ±  0.096 s     5.665 s ±  0.129 s
    mega-renames:     13.812 s ±  0.162 s    11.435 s ±  0.158 s
    just-one-mega:   506.0  ms ±  3.9  ms   494.2  ms ±  6.1  ms

While this improvement looks rather modest for these testcases (because
all the previous optimizations were sufficient to nearly remove all time
spent in rename detection already),  consider this alternative testcase
tweaked from the ones in commit 557ac0350d as follows

    <Same initial setup as commit 557ac0350d, then...>
    $ git switch -c add-empty-file v5.5
    $ >drivers/gpu/drm/i915/new-empty-file
    $ git add drivers/gpu/drm/i915/new-empty-file
    $ git commit -m "new file"
    $ git switch 5.4-rename
    $ git cherry-pick --strategy=ort add-empty-file

For this testcase, we see the following improvement:

                            Before                  After
    pick-empty:        1.936 s ±  0.024 s     688.1 ms ±  4.2 ms

So roughly a factor of 3 speedup.  At $DAYJOB, there was a particular
repository and cherry-pick that inspired this optimization; for that
case I saw a speedup factor of 7 with this optimization.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-18 14:32:56 -07:00
ec59da6015 merge-ort: record the reason that we want a rename for a file
There are two different reasons we might want a rename for a file -- for
three-way content merging or as part of directory rename detection.
Record the reason.  diffcore-rename will potentially be able to filter
some of the ones marked as needed only for directory rename detection,
if it can determine those directory renames based solely on renames
found via exact rename detection and basename-guided rename detection.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-18 14:32:56 -07:00
bf238b7137 diffcore-rename: add computation of number of unknown renames
The previous commit can only be effective if we have a computation of
the number of paths under a given directory which are still have pending
renames, and expected this number to be recorded in the dir_rename_count
map under the key UNKNOWN_DIR.  Add the code necessary to compute these
values.

Note that this change means dir_rename_count might have a directory
whose only entry (for UNKNOWN_DIR) was removed by the time merge-ort
goes to check it.  To account for this, merge-ort needs to check for the
case where the max count is 0.

With this change we are now computing the necessary value for each
directory in dirs_removed, but are not using that value anywhere.  The
next two commits will make use of the values stored in dirs_removed in
order to compute whether each relevant_source (that is needed only for
directory rename detection) has become unnecessary.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-18 14:32:56 -07:00
0491d39297 diffcore-rename: check if we have enough renames for directories early on
As noted in the past few commits, if we can determine that a directory
already has enough renames to determine how directory rename detection
will be decided for that directory, then we can mark that directory as
no longer needing any more renames detected for files underneath it.
For such directories, we change the value in the dirs_removed map from
RELEVANT_TO_SELF to RELEVANT_FOR_ANCESTOR.

A subsequent patch will use this information while iterating over the
remaining potential rename sources to mark ones that were only
location_relevant as unneeded if no containing directory is still marked
as RELEVANT_TO_SELF.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-18 14:32:56 -07:00
e54385b97a diffcore-rename: only compute dir_rename_count for relevant directories
When one side adds files to a directory that the other side renamed,
directory rename detection is used to either move the new paths to the
newer directory or warn the user about the fact that another path
location might be better.

If a parent of the given directory had new files added to it, any
renames in the current directory are also part of determining where the
parent directory is renamed to.  Thus, naively, we need to record each
rename N times for a path at depth N.  However, we can use the
additional information added to dirs_removed in the last commit to avoid
traversing all N parent directories in many cases.  Let's use an example
to explain how this works.  If we have a path named
   src/old_dir/a/b/file.c
and src/old_dir doesn't exist on one side of history, but the other
added a file named src/old_dir/newfile.c, then if one side renamed
   src/old_dir/a/b/file.c => source/new_dir/a/b/file.c
then this file would affect potential directory rename detection counts
for
   src/old_dir/a/b => source/new_dir/a/b
   src/old_dir/a   => source/new_dir/a
   src/old_dir     => source/new_dir
   src             => source
adding a weight of 1 to each in dir_rename_counts.  However, if src/
exists on both sides of history, then we don't need to track any entries
for it in dir_rename_counts.  That was implemented previously.  What we
are adding now, is that if no new files were added to src/old_dir/a or
src/old_dir/b, then we don't need to have counts in dir_rename_count
for those directories either.

In short, we only need to track counts in dir_rename_count for
directories whose dirs_removed value is RELEVANT_FOR_SELF.  And as soon
as we reach a directory that isn't in dirs_removed (signalled by
returning the default value of NOT_RELEVANT from strintmap_get()), we
can stop looking any further up the directory hierarchy.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-18 14:32:55 -07:00
fb52938eec merge-ort: record the reason that we want a rename for a directory
When one side of history renames a directory, and the other side of
history added files to the old directory, directory rename detection is
used to warn about the location of the added files so the user can
move them to the old directory or keep them with the new one.

This sets up three different types of directories:
  * directories that had new files added to them
  * directories underneath a directory that had new files added to them
  * directories where no new files were added to it or any leading path

Save this information in dirs_removed; the next several commits will
make use of this information.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-18 14:32:55 -07:00
a49b55d52e merge-ort, diffcore-rename: tweak dirs_removed and relevant_source type
As noted in the previous commit, we want to be able to take advantage of
the "majority rules" portion of directory rename detection to avoid
detecting more renames than necessary.  However, for diffcore-rename to
take advantage of that, it needs to know whether a rename source file
was needed for just directory rename detection reasons, or if it is
wanted for potential three-way content merging.  Modify relevant_sources
from a strset to a strintmap, so we can encode additional information.

We also modify dirs_removed from a strset to a strintmap at the same
time because trying to determine what files are needed for directory
rename detection will require us tracking a bit more information for
each directory.

This commit only changes the types of the two variables from strset to
strintmap; it does not actually store any special values yet and for now
only checks for presence of entries in the strintmap.  Thus, the code is
functionally identical to how it behaved before.  Future commits will
start associating values with each key for these two maps.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-18 14:32:55 -07:00
ae1db7b31c diffcore-rename: take advantage of "majority rules" to skip more renames
In directory rename detection (when a directory is removed on one side
of history and the other side adds new files to that directory), we work
to find where the greatest number of files within that directory were
renamed to so that the new files can be moved with the majority of the
files.

Naively, we can just do this by detecting renames for *all* files within
the removed/renamed directory, looking at all the destination
directories where files within that directory were moved, and if there
is more than one such directory then taking the one with the greatest
number of files as the directory where the old directory was renamed to.

However, sometimes there are enough renames from exact rename detection
or basename-guided rename detection that we have enough information to
determine the majority winner already.  Add a function meant to compute
whether particular renames are still needed based on this majority rules
check.  The next several commits will then add the necessary
infrastructure to get the information we need to compute which
additional rename sources we can skip.

An important side note for future further optimization:

There is a possible improvement to this optimization that I have not yet
attempted and will not be included in this series of patches: we could
first check whether exact renames provide enough information for us to
determine directory renames, and avoid doing basename-guided rename
detection on some or all of the RELEVANT_LOCATION files within those
directories.  In effect, this variant would mean doing the
handle_early_known_dir_renames() both after exact rename detection and
again after basename-guided rename detection, though it would also mean
decrementing the number of "unknown" renames for each rename we found
from basename-guided rename detection.  Adding this additional check for
skippable renames right after exact rename detection might turn out to
be valuable, especially for partial clones where it might allow us to
download certain source files entirely.  However, this particular
optimization was actually the last one I did in original implementation
order, and by the time I implemented this idea, every testcase I had was
sufficiently fast that further optimization was unwarranted.  If future
testcases arise that tax rename detection more heavily (or perhaps
partial clones can benefit from avoiding loading more objects), it may
be worth implementing this more involved variant.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-18 14:32:55 -07:00
27d578d904 t: annotate !PTHREADS tests with !FAIL_PREREQS
Some tests in t5300 and t7810 expect us to complain about a "--threads"
argument when Git is compiled without pthread support. Running these
under GIT_TEST_FAIL_PREREQS produces a confusing failure: we pretend to
the tests that there is no pthread support, so they expect the warning,
but of course the actual build is perfectly happy to respect the
--threads argument.

We never noticed before the recent a926c4b904 (tests: remove most uses
of C_LOCALE_OUTPUT, 2021-02-11), because the tests also were marked as
requiring the C_LOCALE_OUTPUT prerequisite. Which means they'd never
have run in FAIL_PREREQS mode, since it would always pretend that the
locale prereq was not satisfied.

These tests can't possibly work in this mode; it is a mismatch between
what the tests expect and what the build was told to do. So let's just
mark them to be skipped, using the special prereq introduced by
dfe1a17df9 (tests: add a special setup where prerequisites fail,
2019-05-13).

Reported-by: Son Luong Ngoc <sluongng@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-18 14:17:30 -07:00
f59d15bb42 convert: add classification for conv_attrs struct
Create `enum conv_attrs_classification` to express the different ways
that attributes are handled for a blob during checkout.

This will be used in a later commit when deciding whether to add a file
to the parallel or delayed queue during checkout. For now, we can also
use it in get_stream_filter_ca() to simplify the function (as the
classifying logic is the same).

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-18 13:56:40 -07:00
3e9e82c0d8 convert: add get_stream_filter_ca() variant
Like the previous patch, we will also need to call get_stream_filter()
with a precomputed `struct conv_attrs`, when we add support for parallel
checkout workers. So add the _ca() variant which takes the conversion
attributes struct as a parameter.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-18 13:56:40 -07:00
55b4ad0ead convert: add [async_]convert_to_working_tree_ca() variants
Separate the attribute gathering from the actual conversion by adding
_ca() variants of the conversion functions. These variants receive a
precomputed 'struct conv_attrs', not relying, thus, on an index state.
They will be used in a future patch adding parallel checkout support,
for two reasons:

- We will already load the conversion attributes in checkout_entry(),
  before conversion, to decide whether a path is eligible for parallel
  checkout. Therefore, it would be wasteful to load them again later,
  for the actual conversion.

- The parallel workers will be responsible for reading, converting and
  writing blobs to the working tree. They won't have access to the main
  process' index state, so they cannot load the attributes. Instead,
  they will receive the preloaded ones and call the _ca() variant of
  the conversion functions. Furthermore, the attributes machinery is
  optimized to handle paths in sequential order, so it's better to leave
  it for the main process, anyway.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-18 13:56:40 -07:00
38e95844e8 convert: make convert_attrs() and convert structs public
Move convert_attrs() declaration from convert.c to convert.h, together
with the conv_attrs struct and the crlf_action enum. This function and
the data structures will be used outside convert.c in the upcoming
parallel checkout implementation. Note that crlf_action is renamed to
convert_crlf_action, which is more appropriate for the global namespace.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-18 13:56:40 -07:00
7e5aa13d2c fsmonitor: add perf test for git diff HEAD
Update the xargs call so that if your large repo contains
symlinks, test-tool chmtime failure does not end the script.

On Linux
Test                                                          this tree           upstream/master
---------------------------------------------------------------------------------------------------------
7519.4: status (fsmonitor=fsmonitor-watchman)                 0.52(0.43+0.10)     0.53(0.49+0.05) +1.9%
7519.5: status -uno (fsmonitor=fsmonitor-watchman)            0.21(0.15+0.07)     0.22(0.13+0.09) +4.8%
7519.6: status -uall (fsmonitor=fsmonitor-watchman)           1.65(0.93+0.71)     1.69(1.03+0.65) +2.4%
7519.7: status (dirty) (fsmonitor=fsmonitor-watchman)         11.99(11.34+1.58)   11.95(11.02+1.79) -0.3%
7519.8: diff (fsmonitor=fsmonitor-watchman)                   0.25(0.17+0.26)     0.25(0.18+0.26) +0.0%
7519.9: diff HEAD (fsmonitor=fsmonitor-watchman)              0.39(0.25+0.34)     0.89(0.35+0.74) +128.2%
7519.10: diff -- 0_files (fsmonitor=fsmonitor-watchman)       0.16(0.13+0.04)     0.16(0.12+0.05) +0.0%
7519.11: diff -- 10_files (fsmonitor=fsmonitor-watchman)      0.16(0.12+0.05)     0.16(0.12+0.05) +0.0%
7519.12: diff -- 100_files (fsmonitor=fsmonitor-watchman)     0.16(0.12+0.05)     0.16(0.12+0.05) +0.0%
7519.13: diff -- 1000_files (fsmonitor=fsmonitor-watchman)    0.16(0.11+0.06)     0.16(0.12+0.05) +0.0%
7519.14: diff -- 10000_files (fsmonitor=fsmonitor-watchman)   0.18(0.13+0.06)     0.17(0.10+0.08) -5.6%
7519.15: add (fsmonitor=fsmonitor-watchman)                   2.25(1.53+0.68)     2.25(1.47+0.74) +0.0%
7519.18: status (fsmonitor=disabled)                          0.88(0.73+1.03)     0.89(0.67+1.08) +1.1%
7519.19: status -uno (fsmonitor=disabled)                     0.45(0.43+0.89)     0.45(0.34+0.98) +0.0%
7519.20: status -uall (fsmonitor=disabled)                    1.88(1.16+1.58)     1.88(1.22+1.51) +0.0%
7519.21: status (dirty) (fsmonitor=disabled)                  7.53(7.05+2.11)     7.53(6.98+2.04) +0.0%
7519.22: diff (fsmonitor=disabled)                            0.42(0.37+0.92)     0.42(0.38+0.91) +0.0%
7519.23: diff HEAD (fsmonitor=disabled)                       0.44(0.41+0.90)     0.44(0.40+0.91) +0.0%
7519.24: diff -- 0_files (fsmonitor=disabled)                 0.13(0.09+0.05)     0.13(0.09+0.05) +0.0%
7519.25: diff -- 10_files (fsmonitor=disabled)                0.13(0.10+0.04)     0.13(0.10+0.04) +0.0%
7519.26: diff -- 100_files (fsmonitor=disabled)               0.13(0.09+0.05)     0.13(0.10+0.04) +0.0%
7519.27: diff -- 1000_files (fsmonitor=disabled)              0.13(0.09+0.06)     0.13(0.09+0.05) +0.0%
7519.28: diff -- 10000_files (fsmonitor=disabled)             0.14(0.11+0.05)     0.14(0.10+0.05) +0.0%
7519.29: add (fsmonitor=disabled)                             2.43(1.61+1.64)     2.43(1.69+1.57) +0.0%

On linux (2.29.2 vs w/ this patch):
nipunn@nipunn-dbx:~/src/server3$ strace -f -c git diff 2>&1 | grep lstat
  0.04    0.000063           3        20         6 lstat
nipunn@nipunn-dbx:~/src/server3$ strace -f -c git diff HEAD 2>&1 | grep lstat
 94.98    5.242262          10    523783        13 lstat
nipunn@nipunn-dbx:~/src/server3$ strace -f -c ../git/bin-wrappers/git diff 2>&1 | grep lstat
  0.38    0.000032           5         7         3 lstat
nipunn@nipunn-dbx:~/src/server3$ strace -f -c ../git/bin-wrappers/git diff HEAD 2>&1 | grep lstat
 99.44    0.741892           9     81634        10 lstat

On mac (2.29.2 vs w/ this patch):
nipunn-mbp:server nipunn$ sudo dtruss -L -f -c git diff 2>&1 | grep "^lstat64 "
lstat64                                         8
nipunn-mbp:server nipunn$ sudo dtruss -L -f -c git diff HEAD 2>&1 | grep "^lstat64 "
lstat64                                    120242
nipunn-mbp:server nipunn$ sudo dtruss -L -f -c ../git/bin-wrappers/git diff 2>&1 | grep "^lstat64 "
lstat64                                         4
nipunn-mbp:server nipunn$ sudo dtruss -L -f -c ../git/bin-wrappers/git diff HEAD 2>&1 | grep "^lstat64 "
lstat64                                      4497

There are still a bunch of lstats - on directories, but not every file. Progress!

Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-18 13:31:14 -07:00
0ec9949f78 fsmonitor: add assertion that fsmonitor is valid to check_removed
Validate that fsmonitor is valid to futureproof against bugs where
check_removed might be called from places that haven't refreshed.

Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-18 13:31:13 -07:00
4f3d6d0261 fsmonitor: skip lstat deletion check during git diff-index
Teach git to honor fsmonitor rather than issuing an lstat
when checking for dirty local deletes. Eliminates O(files)
lstats during `git diff HEAD`

Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-18 13:31:11 -07:00
fab78a0c3d checkout: don't follow symlinks when removing entries
At 1d718a5108 ("do not overwrite untracked symlinks", 2011-02-20),
symlink.c:check_leading_path() started returning different codes for
FL_ENOENT and FL_SYMLINK. But one of its callers, unlink_entry(), was
not adjusted for this change, so it started to follow symlinks on the
leading path of to-be-removed entries. Fix that and add a regression
test.

Note that since 1d718a5108 check_leading_path() no longer differentiates
the case where it found a symlink in the path's leading components from
the cases where it found a regular file or failed to lstat() the
component. So, a side effect of this current patch is that
unlink_entry() now returns early in all of these three cases. And
because we no longer try to unlink such paths, we also don't get the
warning from remove_or_warn().

For the regular file and symlink cases, it's questionable whether the
warning was useful in the first place: unlink_entry() removes tracked
paths that should no longer be present in the state we are checking out
to. If the path had its leading dir replaced by another file, it means
that the basename already doesn't exist, so there is no need for a
warning. Sure, we are leaving a regular file or symlink behind at the
path's dirname, but this file is either untracked now (so again, no
need to warn), or it will be replaced by a tracked file during the next
phase of this checkout operation.

As for failing to lstat() one of the leading components, the basename
might still exist only we cannot unlink it (e.g. due to the lack of the
required permissions). Since the user expect it to be removed
(especially with checkout's --no-overlay option), add back the warning
in this more relevant case.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-18 12:58:10 -07:00
462b4e8dfd symlinks: update comment on threaded_check_leading_path()
Since 1d718a5108 ("do not overwrite untracked symlinks", 2011-02-20),
the comment on top of threaded_check_leading_path() is outdated and no
longer reflects the behavior of this function. Let's updated it to avoid
confusions. While we are here, also remove some duplicated comments to
avoid similar maintenance problems.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-18 12:58:08 -07:00
fb79f5bff7 fsck.c: refactor and rename common config callback
Refactor code I recently changed in 1f3299fda9 (fsck: make
fsck_config() re-usable, 2021-01-05) so that I could use fsck's config
callback in mktag in 1f3299fda9 (fsck: make fsck_config() re-usable,
2021-01-05).

I don't know what I was thinking in structuring the code this way, but
it clearly makes no sense to have an fsck_config_internal() at all
just so it can get a fsck_options when git_config() already supports
passing along some void* data.

Let's just make use of that instead, which gets us rid of the two
wrapper functions, and brings fsck's common config callback in line
with other such reusable config callbacks.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-17 14:02:43 -07:00
4abc57848d fsmonitor: do not forget to release the token in discard_index()
In 56c6910028 (fsmonitor: change last update timestamp on the
index_state to opaque token, 2020-01-07), we forgot to adjust
`discard_index()` to release the "last-update" token: it is no longer a
64-bit number, but a free-form string that has been allocated.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-17 12:19:28 -07:00
3dfd30598b fsmonitor: fix memory corruption in some corner cases
In 56c6910028 (fsmonitor: change last update timestamp on the
index_state to opaque token, 2020-01-07), we forgot to adjust the part
of `unpack_trees()` that copies the FSMonitor "last-update" information
that we copy from the source index to the result index since 679f2f9fdd
(unpack-trees: skip stat on fsmonitor-valid files, 2019-11-20).

Since the "last-update" information is no longer a 64-bit number, but a
free-form string that has been allocated, we need to duplicate it rather
than just copying it.

This is important because there _are_ cases when `unpack_trees()` will
perform a oneway merge that implicitly calls `refresh_fsmonitor()`
(which will allocate that "last-update" token). This happens _after_
that token was copied into the result index. However, we _then_ call
`check_updates()` on that index, which will _also_ call
`refresh_fsmonitor()`, accessing the "last-update" string, which by now
would be released already.

In the instance that lead to this patch, this caused a segmentation
fault during a lengthy, complicated rebase involving the todo command
`reset` that (crucially) had to updated many files. Unfortunately, it
seems very hard to trigger that crash, therefore this patch is not
accompanied by a regression test.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-17 12:19:26 -07:00
cfd409ed09 config.txt: add missing period
Signed-off-by: Kyle Meyer <kyle@kyleam.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-17 11:25:15 -07:00
7730f85594 bisect: peel annotated tags to commits
This patch fixes a bug where git-bisect doesn't handle receiving
annotated tags as "git bisect good <tag>", etc. It's a regression in
27257bc466 (bisect--helper: reimplement `bisect_state` & `bisect_head`
shell functions in C, 2020-10-15).

The original shell code called:

  sha=$(git rev-parse --verify "$rev^{commit}") ||
          die "$(eval_gettext "Bad rev input: \$rev")"

which will peel the input to a commit (or complain if that's not
possible). But the C code just calls get_oid(), which will yield the oid
of the tag.

The fix is to peel to a commit. The error message here is a little
non-idiomatic for Git (since it starts with a capital). I've mostly left
it, as it matches the other converted messages (like the "Bad rev input"
we print when get_oid() fails), though I did add an indication that it
was the peeling that was the problem. It might be worth taking a pass
through this converted code to modernize some of the error messages.

Note also that the test does a bare "grep" (not i18ngrep) on the
expected "X is the first bad commit" output message. This matches the
rest of the test script.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-17 11:24:08 -07:00
5f70859c15 t5606: run clone branch name test with protocol v2
4f37d45706 ("clone: respect remote unborn HEAD", 2021-02-05) introduces
a new feature (if the remote has an unborn HEAD, e.g. when the remote
repository is empty, use it as the name of the branch) that only works
in protocol v2, but did not ensure that one of its tests always uses
protocol v2, and thus that test would fail if
GIT_TEST_PROTOCOL_VERSION=0 (or 1) is used. Therefore, add "-c
protocol.version=2" to the appropriate test.

(The rest of the tests from that commit have "-c protocol.version=2"
already added.)

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-17 11:19:36 -07:00
116affac3f mem-pool: drop trailing semicolon from macro definition
Allow BLOCK_GROWTH_SIZE to be used like an integer literal by removing
the trailing semicolon from its definition.  Also wrap the expression in
parentheses, to allow it to be used with operators without leading to
unexpected results.  It doesn't matter for the current use site, but
make it follow standard macro rules anyway to avoid future surprises.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-17 10:20:16 -07:00
3d8cbbf2c3 block-sha1: drop trailing semicolon from macro definition
23119ffb4e (block-sha1: put expanded macro parameters in parentheses,
2012-07-22) added a trailing semicolon to the definition of SHA_MIX
without explanation.  It doesn't matter with the current code, but make
sure to avoid potential surprises by removing it again.

This allows the macro to be used almost like a function: Users can
combine it with operators of their choice, but still must not pass an
expression with side-effects as a parameter, as it would be evaluated
multiple times.

Signed-off-by: René Scharfe <l.s.r@web.de>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-17 10:20:01 -07:00
097ea2c848 fsmonitor: avoid global-buffer-overflow READ when checking trivial response
query_result can be be an empty strbuf (STRBUF_INIT) - in that case
trying to read 3 bytes triggers a buffer overflow read (as
query_result.buf = '\0').

Therefore we need to check query_result's length before trying to read 3
bytes.

This overflow was introduced in:
  940b94f35c (fsmonitor: log invocation of FSMonitor hook to trace2, 2021-02-03)
It was found when running the test-suite against ASAN, and can be most
easily reproduced with the following command:

make GIT_TEST_OPTS="-v" DEFAULT_TEST_TARGET="t7519-status-fsmonitor.sh" \
SANITIZE=address DEVELOPER=1 test

==2235==ERROR: AddressSanitizer: global-buffer-overflow on address 0x0000019e6e5e at pc 0x00000043745c bp 0x7fffd382c520 sp 0x7fffd382bcc8
READ of size 3 at 0x0000019e6e5e thread T0
    #0 0x43745b in MemcmpInterceptorCommon(void*, int (*)(void const*, void const*, unsigned long), void const*, void const*, unsigned long) /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:842:7
    #1 0x43786d in bcmp /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:887:10
    #2 0x80b146 in fsmonitor_is_trivial_response /home/ahunt/oss-fuzz/git/fsmonitor.c:192:10
    #3 0x80b146 in query_fsmonitor /home/ahunt/oss-fuzz/git/fsmonitor.c:175:7
    #4 0x80a749 in refresh_fsmonitor /home/ahunt/oss-fuzz/git/fsmonitor.c:267:21
    #5 0x80bad1 in tweak_fsmonitor /home/ahunt/oss-fuzz/git/fsmonitor.c:429:4
    #6 0x90f040 in read_index_from /home/ahunt/oss-fuzz/git/read-cache.c:2321:3
    #7 0x8e5d08 in repo_read_index_preload /home/ahunt/oss-fuzz/git/preload-index.c:164:15
    #8 0x52dd45 in prepare_index /home/ahunt/oss-fuzz/git/builtin/commit.c:363:6
    #9 0x52a188 in cmd_commit /home/ahunt/oss-fuzz/git/builtin/commit.c:1588:15
    #10 0x4ce77e in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11
    #11 0x4ccb18 in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3
    #12 0x4cb01c in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4
    #13 0x4cb01c in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19
    #14 0x6aca8d in main /home/ahunt/oss-fuzz/git/common-main.c:52:11
    #15 0x7fb027bf5349 in __libc_start_main (/lib64/libc.so.6+0x24349)
    #16 0x4206b9 in _start /home/abuild/rpmbuild/BUILD/glibc-2.26/csu/../sysdeps/x86_64/start.S:120

0x0000019e6e5e is located 2 bytes to the left of global variable 'strbuf_slopbuf' defined in 'strbuf.c:51:6' (0x19e6e60) of size 1
  'strbuf_slopbuf' is ascii string ''
0x0000019e6e5e is located 126 bytes to the right of global variable 'signals' defined in 'sigchain.c:11:31' (0x19e6be0) of size 512
SUMMARY: AddressSanitizer: global-buffer-overflow /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:842:7 in MemcmpInterceptorCommon(void*, int (*)(void const*, void const*, unsigned long), void const*, void const*, unsigned long)
Shadow bytes around the buggy address:
  0x000080334d70: f9 f9 f9 f9 00 f9 f9 f9 f9 f9 f9 f9 00 00 00 00
  0x000080334d80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x000080334d90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x000080334da0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x000080334db0: 00 00 00 00 00 00 00 00 00 00 00 00 f9 f9 f9 f9
=>0x000080334dc0: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9[f9]01 f9 f9 f9
  0x000080334dd0: f9 f9 f9 f9 03 f9 f9 f9 f9 f9 f9 f9 02 f9 f9 f9
  0x000080334de0: f9 f9 f9 f9 00 f9 f9 f9 f9 f9 f9 f9 04 f9 f9 f9
  0x000080334df0: f9 f9 f9 f9 01 f9 f9 f9 f9 f9 f9 f9 00 00 00 00
  0x000080334e00: f9 f9 f9 f9 00 00 00 00 f9 f9 f9 f9 01 f9 f9 f9
  0x000080334e10: f9 f9 f9 f9 04 f9 f9 f9 f9 f9 f9 f9 00 f9 f9 f9
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Acked-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-17 10:00:20 -07:00
1c57cc70ec cocci: allow xcalloc(1, size)
Allocating a pre-cleared single element is quite common and it is
misleading to use CALLOC_ARRAY(); these allocations that would be
affected without this change are not allocating an array.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-15 17:56:07 -07:00
486f4bd183 xcalloc: use CALLOC_ARRAY() when applicable
These are for codebase before Git 2.31

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-15 17:51:10 -07:00
9fd1902762 unix-stream-server: create unix domain socket under lock
Create a wrapper class for `unix_stream_listen()` that uses a ".lock"
lockfile to create the unix domain socket in a race-free manner.

Unix domain sockets have a fundamental problem on Unix systems because
they persist in the filesystem until they are deleted.  This is
independent of whether a server is actually listening for connections.
Well-behaved servers are expected to delete the socket when they
shutdown.  A new server cannot easily tell if a found socket is
attached to an active server or is leftover cruft from a dead server.
The traditional solution used by `unix_stream_listen()` is to force
delete the socket pathname and then create a new socket.  This solves
the latter (cruft) problem, but in the case of the former, it orphans
the existing server (by stealing the pathname associated with the
socket it is listening on).

We cannot directly use a .lock lockfile to create the socket because
the socket is created by `bind(2)` rather than the `open(2)` mechanism
used by `tempfile.c`.

As an alternative, we hold a plain lockfile ("<path>.lock") as a
mutual exclusion device.  Under the lock, we test if an existing
socket ("<path>") is has an active server.  If not, we create a new
socket and begin listening.  Then we use "rollback" to delete the
lockfile in all cases.

This wrapper code conceptually exists at a higher-level than the core
unix_stream_connect() and unix_stream_listen() routines that it
consumes.  It is isolated in a wrapper class for clarity.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-15 14:32:51 -07:00
77e522caae unix-socket: disallow chdir() when creating unix domain sockets
Calls to `chdir()` are dangerous in a multi-threaded context.  If
`unix_stream_listen()` or `unix_stream_connect()` is given a socket
pathname that is too long to fit in a `sockaddr_un` structure, it will
`chdir()` to the parent directory of the requested socket pathname,
create the socket using a relative pathname, and then `chdir()` back.
This is not thread-safe.

Teach `unix_sockaddr_init()` to not allow calls to `chdir()` when this
flag is set.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-15 14:32:51 -07:00
55144ccb0a unix-socket: add backlog size option to unix_stream_listen()
Update `unix_stream_listen()` to take an options structure to override
default behaviors.  This commit includes the size of the `listen()` backlog.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-15 14:32:51 -07:00
4f98ce5865 unix-socket: eliminate static unix_stream_socket() helper function
The static helper function `unix_stream_socket()` calls `die()`.  This
is not appropriate for all callers.  Eliminate the wrapper function
and make the callers propagate the error.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-15 14:32:51 -07:00
59c7b88198 simple-ipc: add win32 implementation
Create Windows implementation of "simple-ipc" using named pipes.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-15 14:32:50 -07:00
066d5234d0 simple-ipc: design documentation for new IPC mechanism
Brief design documentation for new IPC mechanism allowing
foreground Git client to talk with an existing daemon process
at a known location using a named pipe or unix domain socket.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-15 14:32:50 -07:00
8c2efa5d76 pkt-line: add options argument to read_packetized_to_strbuf()
Update the calling sequence of `read_packetized_to_strbuf()` to take
an options argument and not assume a fixed set of options.  Update the
only existing caller accordingly to explicitly pass the
formerly-assumed flags.

The `read_packetized_to_strbuf()` function calls `packet_read()` with
a fixed set of assumed options (`PACKET_READ_GENTLE_ON_EOF`).  This
assumption has been fine for the single existing caller
`apply_multi_file_filter()` in `convert.c`.

In a later commit we would like to add other callers to
`read_packetized_to_strbuf()` that need a different set of options.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-15 14:32:50 -07:00
c4ba579397 pkt-line: add PACKET_READ_GENTLE_ON_READ_ERROR option
Introduce PACKET_READ_GENTLE_ON_READ_ERROR option to help libify the
packet readers.

So far, the (possibly indirect) callers of `get_packet_data()` can ask
that function to return an error instead of `die()`ing upon end-of-file.
However, random read errors will still cause the process to die.

So let's introduce an explicit option to tell the packet reader
machinery to please be nice and only return an error on read errors.

This change prepares pkt-line for use by long-running daemon processes.
Such processes should be able to serve multiple concurrent clients and
and survive random IO errors.  If there is an error on one connection,
a daemon should be able to drop that connection and continue serving
existing and future connections.

This ability will be used by a Git-aware "Builtin FSMonitor" feature
in a later patch series.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-15 14:32:50 -07:00
3a63c6a48c pkt-line: do not issue flush packets in write_packetized_*()
Remove the `packet_flush_gently()` call in `write_packetized_from_buf() and
`write_packetized_from_fd()` and require the caller to call it if desired.
Rename both functions to `write_packetized_from_*_no_flush()` to prevent
later merge accidents.

`write_packetized_from_buf()` currently only has one caller:
`apply_multi_file_filter()` in `convert.c`.  It always wants a flush packet
to be written after writing the payload.

However, we are about to introduce a caller that wants to write many
packets before a final flush packet, so let's make the caller responsible
for emitting the flush packet.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-15 14:32:50 -07:00
7455e05e4e pkt-line: eliminate the need for static buffer in packet_write_gently()
Teach `packet_write_gently()` to write the pkt-line header and the actual
buffer in 2 separate calls to `write_in_full()` and avoid the need for a
static buffer, thread-safe scratch space, or an excessively large stack
buffer.

Change `write_packetized_from_fd()` to allocate a temporary buffer rather
than using a static buffer to avoid similar issues here.

These changes are intended to make it easier to use pkt-line routines in
a multi-threaded context with multiple concurrent writers writing to
different streams.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-15 14:32:50 -07:00
00ea64ed7a doc/git-commit: add documentation for fixup=[amend|reword] options
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-15 14:29:36 -07:00
8bedae4599 t3437: use --fixup with options to create amend! commit
We taught `git commit --fixup` to create "amend!" commit. Let's also
update the tests and use it to setup the rebase tests.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-15 14:29:36 -07:00
3d1bda6b5b t7500: add tests for --fixup=[amend|reword] options
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-15 14:29:35 -07:00
3270ae82ac commit: add a reword suboption to --fixup
`git commit --fixup=reword:<commit>` aliases
`--fixup=amend:<commit> --only`, where it creates an empty "amend!"
commit that will reword <commit> without changing its contents when
it is rebased with `--autosquash`.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-15 14:29:35 -07:00
494d314a05 commit: add amend suboption to --fixup to create amend! commit
`git commit --fixup=amend:<commit>` will create an "amend!" commit.
The resulting commit message subject will be "amend! ..." where
"..." is the subject line of <commit> and the initial message
body will be <commit>'s message.

The "amend!" commit when rebased with --autosquash will fixup the
contents and replace the commit message of <commit> with the
"amend!" commit's message body.

In order to prevent rebase from creating commits with an empty
message we refuse to create an "amend!" commit if commit message
body is empty.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-15 14:29:35 -07:00
6e0e288779 sequencer: export and rename subject_length()
This function can be used in other parts of git. Let's move the
function to commit.c and also rename it to make the name of the
function more generic.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-15 14:29:35 -07:00
a5828ae6b5 Git 2.31
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-15 11:51:51 -07:00
8775279891 Merge branch 'jn/mergetool-hideresolved-is-optional'
Disable the recent mergetool's hideresolved feature by default for
backward compatibility and safety.

* jn/mergetool-hideresolved-is-optional:
  doc: describe mergetool configuration in git-mergetool(1)
  mergetool: do not enable hideResolved by default
2021-03-14 16:01:41 -07:00
074d162eff Merge branch 'tb/pack-revindex-on-disk'
Fix for a topic in 'master'.

* tb/pack-revindex-on-disk:
  pack-revindex.c: don't close unopened file descriptors
2021-03-14 16:01:41 -07:00
04fe4d75fa init-db: silence template_dir leak when converting to absolute path
template_dir starts off pointing to either argv or nothing. However if
the value supplied in argv is a relative path, absolute_pathdup() is
used to turn it into an absolute path. absolute_pathdup() allocates
a new string, and we then "leak" it when cmd_init_db() completes.

We don't bother to actually free the return value (instead we UNLEAK
it), because there's no significant advantage to doing so here.
Correctly freeing it would require more significant changes to code flow
which would be more noisy than beneficial.

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-14 15:58:00 -07:00
e4de4502e6 init: remove git_init_db_config() while fixing leaks
The primary goal of this change is to stop leaking init_db_template_dir.
This leak can happen because:
 1. git_init_db_config() allocates new memory into init_db_template_dir
    without first freeing the existing value.
 2. init_db_template_dir might already contain data, either because:
  2.1 git_config() can be invoked twice with this callback in a single
      process - at least 2 allocations are likely.
  2.2 A single git_config() allocation can invoke the callback multiple
      times for a given key (see further explanation in the function
      docs) - each of those calls will trigger another leak.

The simplest fix for the leak would be to free(init_db_template_dir)
before overwriting it. Instead we choose to convert to fetching
init.templatedir via git_config_get_value() as that is more explicit,
more efficient, and avoids allocations (the returned result is owned by
the config cache, so we aren't responsible for freeing it).

If we remove init_db_template_dir, git_init_db_config() ends up being
responsible only for forwarding core.* config values to
platform_core_config(). However platform_core_config() already ignores
non-core.* config values, so we can safely remove git_init_db_config()
and invoke git_config() directly with platform_core_config() as the
callback.

The platform_core_config forwarding was originally added in:
  287853392a (mingw: respect core.hidedotfiles = false in git-init again, 2019-03-11
And I suspect the potential for a leak existed since the original
implementation of git_init_db_config in:
  90b45187ba (Add `init.templatedir` configuration variable., 2010-02-17)

LSAN output from t0001:

Direct leak of 73 byte(s) in 1 object(s) allocated from:
    #0 0x49a859 in realloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
    #1 0x9a7276 in xrealloc /home/ahunt/oss-fuzz/git/wrapper.c:126:8
    #2 0x9362ad in strbuf_grow /home/ahunt/oss-fuzz/git/strbuf.c:98:2
    #3 0x936eaa in strbuf_add /home/ahunt/oss-fuzz/git/strbuf.c:295:2
    #4 0x868112 in strbuf_addstr /home/ahunt/oss-fuzz/git/./strbuf.h:304:2
    #5 0x86a8ad in expand_user_path /home/ahunt/oss-fuzz/git/path.c:758:2
    #6 0x720bb1 in git_config_pathname /home/ahunt/oss-fuzz/git/config.c:1287:10
    #7 0x5960e2 in git_init_db_config /home/ahunt/oss-fuzz/git/builtin/init-db.c:161:11
    #8 0x7255b8 in configset_iter /home/ahunt/oss-fuzz/git/config.c:1982:7
    #9 0x7253fc in repo_config /home/ahunt/oss-fuzz/git/config.c:2311:2
    #10 0x725ca7 in git_config /home/ahunt/oss-fuzz/git/config.c:2399:2
    #11 0x593e8d in create_default_files /home/ahunt/oss-fuzz/git/builtin/init-db.c:225:2
    #12 0x5935c6 in init_db /home/ahunt/oss-fuzz/git/builtin/init-db.c:449:11
    #13 0x59588e in cmd_init_db /home/ahunt/oss-fuzz/git/builtin/init-db.c:714:9
    #14 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11
    #15 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3
    #16 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4
    #17 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19
    #18 0x69c4de in main /home/ahunt/oss-fuzz/git/common-main.c:52:11
    #19 0x7f23552d6349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-14 15:57:59 -07:00
aa1b63971a worktree: fix leak in dwim_branch()
Make sure that we release the temporary strbuf during dwim_branch() for
all codepaths (and not just for the early return).

This leak appears to have been introduced in:
  f60a7b763f (worktree: teach "add" to check out existing branches, 2018-04-24)

Note that UNLEAK(branchname) is still needed: the returned result is
used in add(), and is stored in a pointer which is used to point at one
of:
  - a string literal ("HEAD")
  - member of argv (whatever the user specified in their invocation)
  - or our newly allocated string returned from dwim_branch()
Fixing the branchname leak isn't impossible, but does not seem
worthwhile given that add() is called directly from cmd_main(), and
cmd_main() returns immediately thereafter - UNLEAK is good enough.

This leak was found when running t0001 with LSAN, see also LSAN output
below:

Direct leak of 60 byte(s) in 1 object(s) allocated from:
    #0 0x49a859 in realloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
    #1 0x9ab076 in xrealloc /home/ahunt/oss-fuzz/git/wrapper.c:126:8
    #2 0x939fcd in strbuf_grow /home/ahunt/oss-fuzz/git/strbuf.c:98:2
    #3 0x93af53 in strbuf_splice /home/ahunt/oss-fuzz/git/strbuf.c:239:3
    #4 0x83559a in strbuf_check_branch_ref /home/ahunt/oss-fuzz/git/object-name.c:1593:2
    #5 0x6988b9 in dwim_branch /home/ahunt/oss-fuzz/git/builtin/worktree.c:454:20
    #6 0x695f8f in add /home/ahunt/oss-fuzz/git/builtin/worktree.c:525:19
    #7 0x694a04 in cmd_worktree /home/ahunt/oss-fuzz/git/builtin/worktree.c:1036:10
    #8 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11
    #9 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3
    #10 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4
    #11 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19
    #12 0x69caee in main /home/ahunt/oss-fuzz/git/common-main.c:52:11
    #13 0x7f7b7dd10349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-14 15:57:59 -07:00
0c4542738e clone: free or UNLEAK further pointers when finished
Most of these pointers can safely be freed when cmd_clone() completes,
therefore we make sure to free them. The one exception is that we
have to UNLEAK(repo) because it can point either to argv[0], or a
malloc'd string returned by absolute_pathdup().

We also have to free(path) in the middle of cmd_clone(): later during
cmd_clone(), path is unconditionally overwritten with a different path,
triggering a leak. Freeing the first path immediately after use (but
only in the case where it contains data) seems like the cleanest
solution, as opposed to freeing it unconditionally before path is reused
for another path. This leak appears to have been introduced in:
  f38aa83f9a (use local cloning if insteadOf makes a local URL, 2014-07-17)

These leaks were found when running t0001 with LSAN, see also an excerpt
of the LSAN output below (the full list is omitted because it's far too
long, and mostly consists of indirect leakage of members of the refs we
are freeing).

Direct leak of 178 byte(s) in 1 object(s) allocated from:
    #0 0x49a53d in malloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
    #1 0x9a6ff4 in do_xmalloc /home/ahunt/oss-fuzz/git/wrapper.c:41:8
    #2 0x9a6fca in xmalloc /home/ahunt/oss-fuzz/git/wrapper.c:62:9
    #3 0x8ce296 in copy_ref /home/ahunt/oss-fuzz/git/remote.c:885:8
    #4 0x8d2ebd in guess_remote_head /home/ahunt/oss-fuzz/git/remote.c:2215:10
    #5 0x51d0c5 in cmd_clone /home/ahunt/oss-fuzz/git/builtin/clone.c:1308:4
    #6 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11
    #7 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3
    #8 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4
    #9 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19
    #10 0x69c45e in main /home/ahunt/oss-fuzz/git/common-main.c:52:11
    #11 0x7f6a459d5349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Direct leak of 165 byte(s) in 1 object(s) allocated from:
    #0 0x49a53d in malloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
    #1 0x9a6fc4 in do_xmalloc /home/ahunt/oss-fuzz/git/wrapper.c:41:8
    #2 0x9a6f9a in xmalloc /home/ahunt/oss-fuzz/git/wrapper.c:62:9
    #3 0x8ce266 in copy_ref /home/ahunt/oss-fuzz/git/remote.c:885:8
    #4 0x51e9bd in wanted_peer_refs /home/ahunt/oss-fuzz/git/builtin/clone.c:574:21
    #5 0x51cfe1 in cmd_clone /home/ahunt/oss-fuzz/git/builtin/clone.c:1284:17
    #6 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11
    #7 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3
    #8 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4
    #9 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19
    #10 0x69c42e in main /home/ahunt/oss-fuzz/git/common-main.c:52:11
    #11 0x7f8fef0c2349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Direct leak of 178 byte(s) in 1 object(s) allocated from:
    #0 0x49a53d in malloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
    #1 0x9a6ff4 in do_xmalloc /home/ahunt/oss-fuzz/git/wrapper.c:41:8
    #2 0x9a6fca in xmalloc /home/ahunt/oss-fuzz/git/wrapper.c:62:9
    #3 0x8ce296 in copy_ref /home/ahunt/oss-fuzz/git/remote.c:885:8
    #4 0x8d2ebd in guess_remote_head /home/ahunt/oss-fuzz/git/remote.c:2215:10
    #5 0x51d0c5 in cmd_clone /home/ahunt/oss-fuzz/git/builtin/clone.c:1308:4
    #6 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11
    #7 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3
    #8 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4
    #9 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19
    #10 0x69c45e in main /home/ahunt/oss-fuzz/git/common-main.c:52:11
    #11 0x7f6a459d5349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Direct leak of 165 byte(s) in 1 object(s) allocated from:
    #0 0x49a6b2 in calloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:154:3
    #1 0x9a72f2 in xcalloc /home/ahunt/oss-fuzz/git/wrapper.c:140:8
    #2 0x8ce203 in alloc_ref_with_prefix /home/ahunt/oss-fuzz/git/remote.c:867:20
    #3 0x8ce1a2 in alloc_ref /home/ahunt/oss-fuzz/git/remote.c:875:9
    #4 0x72f63e in process_ref_v2 /home/ahunt/oss-fuzz/git/connect.c:426:8
    #5 0x72f21a in get_remote_refs /home/ahunt/oss-fuzz/git/connect.c:525:8
    #6 0x979ab7 in handshake /home/ahunt/oss-fuzz/git/transport.c:305:4
    #7 0x97872d in get_refs_via_connect /home/ahunt/oss-fuzz/git/transport.c:339:9
    #8 0x9774b5 in transport_get_remote_refs /home/ahunt/oss-fuzz/git/transport.c:1388:4
    #9 0x51cf80 in cmd_clone /home/ahunt/oss-fuzz/git/builtin/clone.c:1271:9
    #10 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11
    #11 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3
    #12 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4
    #13 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19
    #14 0x69c45e in main /home/ahunt/oss-fuzz/git/common-main.c:52:11
    #15 0x7f6a459d5349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Direct leak of 105 byte(s) in 1 object(s) allocated from:
    #0 0x49a859 in realloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
    #1 0x9a71f6 in xrealloc /home/ahunt/oss-fuzz/git/wrapper.c:126:8
    #2 0x93622d in strbuf_grow /home/ahunt/oss-fuzz/git/strbuf.c:98:2
    #3 0x937a73 in strbuf_addch /home/ahunt/oss-fuzz/git/./strbuf.h:231:3
    #4 0x939fcd in strbuf_add_absolute_path /home/ahunt/oss-fuzz/git/strbuf.c:911:4
    #5 0x69d3ce in absolute_pathdup /home/ahunt/oss-fuzz/git/abspath.c:261:2
    #6 0x51c688 in cmd_clone /home/ahunt/oss-fuzz/git/builtin/clone.c:1021:10
    #7 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11
    #8 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3
    #9 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4
    #10 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19
    #11 0x69c45e in main /home/ahunt/oss-fuzz/git/common-main.c:52:11
    #12 0x7f6a459d5349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-14 15:57:59 -07:00
e901de6816 reset: free instead of leaking unneeded ref
dwim_ref() allocs a new string into ref. Instead of setting to NULL to
discard it, we can FREE_AND_NULL.

This leak appears to have been introduced in:
4cf76f6bbf (builtin/reset: compute checkout metadata for reset, 2020-03-16)

This leak was found when running t0001 with LSAN, see also LSAN output below:

Direct leak of 5 byte(s) in 1 object(s) allocated from:
    #0 0x486514 in strdup /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_interceptors.cpp:452:3
    #1 0x9a7108 in xstrdup /home/ahunt/oss-fuzz/git/wrapper.c:29:14
    #2 0x8add6b in expand_ref /home/ahunt/oss-fuzz/git/refs.c:670:12
    #3 0x8ad777 in repo_dwim_ref /home/ahunt/oss-fuzz/git/refs.c:644:22
    #4 0x6394af in dwim_ref /home/ahunt/oss-fuzz/git/./refs.h:162:9
    #5 0x637e5c in cmd_reset /home/ahunt/oss-fuzz/git/builtin/reset.c:426:4
    #6 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11
    #7 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3
    #8 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4
    #9 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19
    #10 0x69c5ce in main /home/ahunt/oss-fuzz/git/common-main.c:52:11
    #11 0x7f57ebb9d349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-14 15:57:59 -07:00
f63b88867a symbolic-ref: don't leak shortened refname in check_symref()
shorten_unambiguous_ref() returns an allocated string. We have to
track it separately from the const refname.

This leak has existed since:
9ab55daa55 (git symbolic-ref --delete $symref, 2012-10-21)

This leak was found when running t0001 with LSAN, see also LSAN output
below:

Direct leak of 19 byte(s) in 1 object(s) allocated from:
    #0 0x486514 in strdup /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_interceptors.cpp:452:3
    #1 0x9ab048 in xstrdup /home/ahunt/oss-fuzz/git/wrapper.c:29:14
    #2 0x8b452f in refs_shorten_unambiguous_ref /home/ahunt/oss-fuzz/git/refs.c
    #3 0x8b47e8 in shorten_unambiguous_ref /home/ahunt/oss-fuzz/git/refs.c:1287:9
    #4 0x679fce in check_symref /home/ahunt/oss-fuzz/git/builtin/symbolic-ref.c:28:14
    #5 0x679ad8 in cmd_symbolic_ref /home/ahunt/oss-fuzz/git/builtin/symbolic-ref.c:70:9
    #6 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11
    #7 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3
    #8 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4
    #9 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19
    #10 0x69cc6e in main /home/ahunt/oss-fuzz/git/common-main.c:52:11
    #11 0x7f98388a4349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-14 15:57:59 -07:00
5be1c70518 Merge tag 'l10n-2.31.0-rnd2' of git://github.com/git-l10n/git-po
l10n for Git 2.31.0 round 2

* tag 'l10n-2.31.0-rnd2' of git://github.com/git-l10n/git-po:
  l10n: zh_CN: for git v2.31.0 l10n round 1 and 2
  l10n: de.po: Update German translation for Git v2.31.0
  l10n: pt_PT: add Portuguese translations part 1
  l10n: vi.po(5104t): for git v2.31.0 l10n round 2
  l10n: es: 2.31.0 round 2
  l10n: Add translation team info
  l10n: start Indonesian translation
  l10n: zh_TW.po: v2.31.0 round 2 (15 untranslated)
  l10n: bg.po: Updated Bulgarian translation (5104t)
  l10n: fr: v2.31 rnd 2
  l10n: tr: v2.31.0-rc1
  l10n: sv.po: Update Swedish translation (5104t0f0u)
  l10n: git.pot: v2.31.0 round 2 (9 new, 8 removed)
  l10n: tr: v2.31.0-rc0
  l10n: sv.po: Update Swedish translation (5103t0f0u)
  l10n: pl.po: Update translation
  l10n: fr: v2.31.0 rnd 1
  l10n: git.pot: v2.31.0 round 1 (155 new, 89 removed)
  l10n: Update Catalan translation
  l10n: ru.po: update Russian translation
2021-03-14 15:50:36 -07:00
8588aa8657 vcs-svn: remove header files as well
fc47391e24 (drop vcs-svn experiment, 2020-08-13) removed most vcs-svn
files.  Drop the remaining header files as well, as they are no longer
used.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-14 15:48:23 -07:00
473eb54151 l10n: zh_CN: for git v2.31.0 l10n round 1 and 2
Translate 161 new messages (5104t0f0u) for git 2.31.0.

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
2021-03-15 00:05:25 +08:00
4bc948a743 Merge branch 'master' of github.com:vnwildman/git
* 'master' of github.com:vnwildman/git:
  l10n: vi.po(5104t): for git v2.31.0 l10n round 2
2021-03-15 00:04:47 +08:00
e196890735 Merge branch 'l10n/zh_TW/210301' of github.com:l10n-tw/git-po
* 'l10n/zh_TW/210301' of github.com:l10n-tw/git-po:
  l10n: zh_TW.po: v2.31.0 round 2 (15 untranslated)
2021-03-14 22:35:44 +08:00
84bc81478e Merge branch 'po-id' of github.com:bagasme/git-po
* 'po-id' of github.com:bagasme/git-po:
  l10n: Add translation team info
  l10n: start Indonesian translation
2021-03-14 22:35:17 +08:00
bd5fba827b Merge branch 'master' of github.com:Softcatala/git-po
* 'master' of github.com:Softcatala/git-po:
  l10n: Update Catalan translation
2021-03-14 22:34:46 +08:00
2d897529b2 Merge branch 'russian-l10n' of github.com:DJm00n/git-po-ru
* 'russian-l10n' of github.com:DJm00n/git-po-ru:
  l10n: ru.po: update Russian translation
2021-03-14 22:34:12 +08:00
799df2e406 Merge branch 'pt-PT' of github.com:git-l10n-pt-PT/git-po
* 'pt-PT' of github.com:git-l10n-pt-PT/git-po:
  l10n: pt_PT: add Portuguese translations part 1
2021-03-14 22:33:26 +08:00
ca56dadb4b use CALLOC_ARRAY
Add and apply a semantic patch for converting code that open-codes
CALLOC_ARRAY to use it instead.  It shortens the code and infers the
element size automatically.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-13 16:00:09 -08:00
f1121499e6 git-compat-util.h: drop trailing semicolon from macro definition
Make CALLOC_ARRAY usable like a function by requiring callers to supply
the trailing semicolon, which all of the current ones already do.  With
the extra semicolon e.g. the following code wouldn't compile because it
disconnects the "else" from the "if":

	if (condition)
		CALLOC_ARRAY(ptr, n);
	else
		whatever();

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-13 15:56:13 -08:00
4c8e3dca6e Documentation/git-push.txt: correct configuration typo
In the EXAMPLES section, git-push(1) says that 'git push origin' pushes
the current branch to the value of the 'remote.origin.merge'
configuration.

This wording (which dates back to b2ed944af7 (push: switch default from
"matching" to "simple", 2013-01-04)) is incorrect. There is no such
configuration as 'remote.<name>.merge'. This likely was originally
intended to read "branch.<name>.merge" instead.

Indeed, when 'push.default' is 'simple' (which is the default value, and
is applicable in this scenario per "without additional configuration"),
setup_push_upstream() dies if the branch's local name does not match
'branch.<name>.merge'.

Correct this long-standing typo to resolve some recent confusion on the
intended behavior of this example.

Reported-by: Adam Sharafeddine <adam.shrfdn@gmail.com>
Reported-by: Fabien Terrani <terranifabien@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-13 15:41:45 -08:00
53204061ac doc: describe mergetool configuration in git-mergetool(1)
In particular, this describes mergetool.hideResolved, which can help
users discover this setting (either because it may be useful to them
or in order to understand mergetool's behavior if they have forgotten
setting it in the past).

Tested by running

	make -C Documentation git-mergetool.1
	man Documentation/git-mergetool.1

and reading through the page.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-13 15:34:32 -08:00
b2a51c1b03 mergetool: do not enable hideResolved by default
When 98ea309b3f (mergetool: add hideResolved configuration,
2021-02-09) introduced the mergetool.hideResolved setting to reduce
the clutter in viewing non-conflicted sections of files in a
mergetool, it enabled it by default, explaining:

    No adverse effects were noted in a small survey of popular mergetools[1]
    so this behavior defaults to `true`.

In practice, alas, adverse effects do appear.  A few issues:

1. No indication is shown in the UI that the base, local, and remote
   versions shown have been modified by additional resolution.  This
   is inherent in the design: the idea of mergetool.hideResolved is to
   convince a mergetool that expects pristine local, base, and remote
   files to show partially resolved verisons of those files instead;
   there is no additional source of information accessible to the
   mergetool to see where the resolution has happened.

   (By contrast, a mergetool generating the partial resolution from
   conflict markers for itself would be able to hilight the resolved
   sections with a different color.)

   A user accustomed to seeing the files without partial resolution
   gets no indication that this behavior has changed when they upgrade
   Git.

2. If the computed merge did not line up the files correctly (for
   example due to repeated sections in the file), the partially
   resolved files can be misleading and do not have enough information
   to reconstruct what happened and compute the correct merge result.

3. Resolving a conflict can involve information beyond the textual
   conflict.  For example, if the local and remote versions added
   overlapping functionality in different ways, seeing the full
   unresolved versions of each alongside the base gives information
   about each side's intent that makes it possible to come up with a
   resolution that combines those two intents.  By contrast, when
   starting with partially resolved versions of those files, one can
   produce a subtly wrong resolution that includes redundant extra
   code added by one side that is not needed in the approach taken
   on the other.

All that said, a user wanting to focus on textual conflicts with
reduced clutter can still benefit from mergetool.hideResolved=true as
a way to deemphasize sections of the code that resolve cleanly without
requiring any changes to the invoked mergetool.  The caveats described
above are reduced when the user has explicitly turned this on, because
then the user is aware of them.

Flip the default to 'false'.

Reported-by: Dana Dahlstrom <dahlstrom@google.com>
Helped-by: Seth House <seth@eseth.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-13 15:30:29 -08:00
a4a4439fdf http: drop the check for an empty proxy password before approving
credential_approve() already checks for a non-empty password before
saving, so there's no need to do the extra check here.

Signed-off-by: John Szakmeister <john@szakmeister.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-11 22:17:10 -08:00
cd27f604e4 http: store credential when PKI auth is used
We already looked for the PKI credentials in the credential store, but
failed to approve it on success.  Meaning, the PKI certificate password
was never stored and git would request it on every connection to the
remote.  Let's complete the chain by storing the certificate password on
success.

Likewise, we also need to reject the credential when there is a failure.
Curl appears to report client-related certificate issues are reported
with the CURLE_SSL_CERTPROBLEM error.  This includes not only a bad
password, but potentially other client certificate related problems.
Since we cannot get more information from curl, we'll go ahead and
reject the credential upon receiving that error, just to be safe and
avoid caching or saving a bad password.

Signed-off-by: John Szakmeister <john@szakmeister.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-11 22:17:07 -08:00
96099726dd archive: expand only a single %(describe) per archive
Every %(describe) placeholder in $Format:...$ strings in files with the
attribute export-subst is expanded by calling git describe.  This can
potentially result in a lot of such calls per archive.  That's OK for
local repositories under control of the user of git archive, but could
be a problem for hosted repositories.

Expand only a single %(describe) placeholder per archive for now to
avoid denial-of-service attacks.  We can make this limit configurable
later if needed, but let's start out simple.

Reported-by: Jeff King <peff@peff.net>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-11 13:22:44 -08:00
e4fd06e7e2 diffcore-rename: avoid doing basename comparisons for irrelevant sources
The basename comparison optimization implemented in
find_basename_matches() is very beneficial since it allows a source to
sometimes only be compared with one other file instead of N other files.
When a match is found, both a source and destination can be removed from
the matrix of inexact rename comparisons.  In contrast, the irrelevant
source optimization only allows us to remove a source from the matrix of
inexact rename comparisons...but it has the advantage of allowing a
source file to not even be loaded into memory at all and be compared to
0 other files.  Generally, not even comparing is a bigger performance
win, so when both optimizations could apply, prefer to use the
irrelevant-source optimization.

For the testcases mentioned in commit 557ac0350d ("merge-ort: begin
performance work; instrument with trace2_region_* calls", 2020-10-28),
this change improves the performance as follows:

                            Before                  After
    no-renames:        5.708 s ±  0.111 s     5.680 s ±  0.096 s
    mega-renames:    102.171 s ±  0.440 s    13.812 s ±  0.162 s
    just-one-mega:     3.471 s ±  0.015 s   506.0  ms ±  3.9  ms

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-10 22:18:05 -08:00
f89b4f2bee merge-ort: skip rename detection entirely if possible
diffcore_rename_extended() will do a bunch of setup, then check for
exact renames, then abort before inexact rename detection if there are
no more sources or destinations that need to be matched.  It will
sometimes be the case, however, that either
  * we start with neither any sources or destinations
  * we start with no *relevant* sources
In the first of these two cases, the setup and exact rename detection
will be very cheap since there are 0 files to operate on.  In the second
case, it is quite possible to have thousands of files with none of the
source ones being relevant.  Avoid calling diffcore_rename_extended() or
even some of the setup before diffcore_rename_extended() when we can
determine that rename detection is unnecessary.

For the testcases mentioned in commit 557ac0350d ("merge-ort: begin
performance work; instrument with trace2_region_* calls", 2020-10-28),
this change improves the performance as follows:

                            Before                  After
    no-renames:        6.003 s ±  0.048 s     5.708 s ±  0.111 s
    mega-renames:    114.009 s ±  0.236 s   102.171 s ±  0.440 s
    just-one-mega:     3.489 s ±  0.017 s     3.471 s ±  0.015 s

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-10 22:18:05 -08:00
174791f0fb merge-ort: use relevant_sources to filter possible rename sources
The past several commits determined conditions when rename sources might
be needed, and filled a relevant_sources strset with those paths.  Pass
these along to diffcore_rename_extended() to use to limit the sources
that we need to detect renames for.

For the testcases mentioned in commit 557ac0350d ("merge-ort: begin
performance work; instrument with trace2_region_* calls", 2020-10-28),
this change improves the performance as follows:

                            Before                  After
    no-renames:       12.596 s ±  0.061 s     6.003 s ±  0.048 s
    mega-renames:    130.465 s ±  0.259 s   114.009 s ±  0.236 s
    just-one-mega:     3.958 s ±  0.010 s     3.489 s ±  0.017 s

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-10 22:18:05 -08:00
2fd9eda462 merge-ort: precompute whether directory rename detection is needed
The point of directory rename detection is that if one side of history
renames a directory, and the other side adds new files under the old
directory, then the merge can move those new files into the new
directory.  This leads to the following important observation:

  * If the other side does not add any new files under the old
    directory, we do not need to detect any renames for that directory.

Similarly, directory rename detection had an important requirement:

  * If a directory still exists on one side of history, it has not been
    renamed on that side of history.  (See section 4 of t6423 or
    Documentation/technical/directory-rename-detection.txt for more
    details).

Using these two bits of information, we note that directory rename
detection is only needed in cases where (1) directories exist in the
merge base and on one side of history (i.e. dirmask == 3 or dirmask ==
5), and (2) where there is some new file added to that directory on the
side where it still exists (thus where the file has filemask == 2 or
filemask == 4, respectively).  This has to be done in two steps, because
we have the dirmask when we are first considering the directory, and
won't get the filemasks for the files within it until we recurse into
that directory.  So, we save
  dir_rename_mask = dirmask - 1
when we hit a directory that is missing on one side, and then later look
for cases of
  filemask == dir_rename_mask

One final note is that as soon as we hit a directory that needs
directory rename detection, we will need to detect renames in all
subdirectories of that directory as well due to the "majority rules"
decision when files are renamed into different directory hierarchies.
We arbitrarily use the special value of 0x07 to record when we've hit
such a directory.

The combination of all the above mean that we introduce a variable
named dir_rename_mask (couldn't think of a better name) which has one
of the following values as we traverse into a directory:
   * 0x00: directory rename detection not needed
   * 0x02 or 0x04: directory rename detection only needed if files added
   * 0x07: directory rename detection definitely needed

We then pass this value through to add_pairs() so that it can mark
location_relevant as true only when dir_rename_mask is 0x07.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-10 22:18:05 -08:00
a68e6cea59 merge-ort: introduce wrappers for alternate tree traversal
Add traverse_trees_wrapper() and traverse_trees_wrapper_callback()
functions.  The former runs traverse_trees() with info->fn set to
traverse_trees_wrapper_callback, in order to simply save all the entries
without processing or recursing into any of them.  This step allows
extra computation to be done (e.g. checking some condition across all
files) that can be used later.  Then, after that is completed, it
iterates over all the saved entries and calls the original info->fn
callback with the saved data.

Currently, this does nothing more than marginally slowing down the tree
traversal since we do not take advantage of the opportunity to compute
anything special in traverse_trees_wrapper_callback(), and thus the real
callback will be called identically as it would have been without this
extra wrapper.  However, a subsequent commit will add some special
computation of some values that the real callback will be able to use.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-10 22:18:05 -08:00
beb06145f8 merge-ort: add data structures for an alternate tree traversal
In order to determine whether directory rename detection is needed, we
as a pre-requisite need a way to traverse through all the files in a
given tree before visiting any directories within that tree.
traverse_trees() only iterates through the entries in the order they
appear, so add some data structures that will store all the entries as
we iterate through them in traverse_trees(), which will allow us to
re-traverse them in our desired order.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-10 22:18:04 -08:00
32a56dfb99 merge-ort: precompute subset of sources for which we need rename detection
rename detection works by trying to pair all file deletions (or
"sources") with all file additions (or "destinations"), checking
similarity, and then marking the sufficiently similar ones as renames.
This can be expensive if there are many sources and destinations on a
given side of history as it results in an N x M comparison matrix.
However, there are many cases where we can compute in advance that
detecting renames for some of the sources provides no useful information
and thus that we can exclude those sources from the matrix.

To see why, first note that the merge machinery uses detected renames in
two ways:

   * directory rename detection: when one side of history renames a
       directory, and the other side of history adds new files to that
       directory, we want to be able to warn the user about the need to
       chose whether those new files stay in the old directory or move
       to the new one.

   * three-way content merging: in order to do three-way content merging
       of files, we need three different file versions.  If one side of
       history renamed a file, then some of the content for the file is
       found under a different path than in the merge base or on the
       other side of history.

Add a simple testcase showing the two kinds of reasons renames are
relevant; it's a testcase that will only pass if we detect both kinds of
needed renames.

Other than the testcase added above, this commit concentrates just on
the three-way content merging; it will punt and mark all sources as
needed for directory rename detection, and leave it to future commits to
narrow that down more.

The point of three-way content merging is to reconcile changes made on
*both* sides of history.  What if the file wasn't modified on both
sides?  There are two possibilities:

   * If it wasn't modified on the renamed side:
       -> then we get to do exact rename detection, which is cheap.

   * If it wasn't modified on the unrenamed side:
       -> then detection of a rename for that source file is irrelevant

That latter claim might be surprising at first, so let's walk through a
case to show why rename detection for that source file is irrelevant.
Let's use two filenames, old.c & new.c, with the following abbreviated
object ids (and where the value '000000' is used to denote that the file
is missing in that commit):

                 old.c     new.c
   MERGE_BASE:   01d01d    000000
   MERGE_SIDE1:  01d01d    000000
   MERGE_SIDE2:  000000    5e1ec7

If the rename *isn't* detected:
   then old.c looks like it was unmodified on one side and deleted on
   the other and should thus be removed.  new.c looks like a new file we
   should keep as-is.

If the rename *is* detected:
   then a three-way content merge is done.  Since the version of the
   file in MERGE_BASE and MERGE_SIDE1 are identical, the three-way merge
   will produce exactly the version of the file whose abbreviated
   object id is 5e1ec7.  It will record that file at the path new.c,
   while removing old.c from the directory.

Note that these two results are identical -- a single file named 'new.c'
with object id 5e1ec7.  In other words, it doesn't matter if the rename
is detected in the case where the file is unmodified on the unrenamed
side.

Use this information to compute whether we need rename detection for
each source created in add_pair().

It's probably worth noting that there used to be a few other edge or
corner cases besides three-way content merges and directory rename
detection where lack of rename detection could have affected the result,
but those cases actually highlighted where conflict resolution methods
were not consistent with each other.  Fixing those inconsistencies were
thus critically important to enabling this optimization.  That work
involved the following:

 * bringing consistency to add/add, rename/add, and rename/rename
    conflict types, as done back in the topic merged at commit
    ac193e0e0a ("Merge branch 'en/merge-path-collision'", 2019-01-04),
    and further extended in commits 2a7c16c980 ("t6422, t6426: be more
    flexible for add/add conflicts involving renames", 2020-08-10) and
    e8eb99d4a6 ("t642[23]: be more flexible for add/add conflicts
    involving pair renames", 2020-08-10)

  * making rename/delete more consistent with modify/delete
    as done in commits 1f3c9ba707 ("t6425: be more flexible with
    rename/delete conflict messages", 2020-08-10) and 727c75b23f
    ("t6404, t6423: expect improved rename/delete handling in ort
    backend", 2020-10-26)

Since the set of relevant_sources we compute has not yet been narrowed
down for directory rename detection, we do not pass it to
diffcore_rename_extended() yet.  That will be done after subsequent
commits narrow down the list of relevant_sources needed for directory
rename detection reasons.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-10 22:18:04 -08:00
9799889f2e diffcore-rename: enable filtering possible rename sources
Add the ability to diffcore_rename_extended() to allow external callers
to declare that they only need renames detected for a subset of source
files, and use that information to skip detecting renames for them.

There are two important pieces to this optimization that may not be
obvious at first glance:

  * We do not require callers to just filter the filepairs out
    to remove the non-relevant sources, because exact rename detection
    is fast and when it finds a match it can remove both a source and a
    destination whereas the relevant_sources filter can only remove a
    source.

  * We need to filter out the source pairs in a preliminary pass instead
    of adding a
       strset_contains(relevant_sources, one->path)
    check within the nested matrix loop.  The reason for that is if we
    have 30k renames, doing 30k * 30k = 900M strset_contains() calls
    becomes extraordinarily expensive and defeats the performance gains
    from this change; we only want to do 30k such calls instead.

If callers pass NULL for relevant_sources, that is special cases to
treat all sources as relevant.  Since all callers currently pass NULL,
this optimization does not yet have any effect.  Subsequent commits will
have merge-ort compute a set of relevant_sources to restrict which
sources we detect renames for, and have merge-ort pass that set of
relevant_sources to diffcore_rename_extended().

A note about filtering order:

Some may be curious why we don't filter out irrelevant sources at the
same time we filter out exact renames.  While that technically could be
done at this point, there are two reasons to defer it:

First, was to reinforce a lesson that was too easy to forget.  As I
mentioned above, in the past I filtered irrelevant sources out before
exact rename checking, and then discovered that exact renames' ability
to remove both sources and destinations was an important consideration
and thus doing the filtering after exact rename checking would speed
things up.  Then at some point I realized that basename matching could
also remove both sources and destinations, and decided to put irrelevant
source filtering after basename filtering.  That slowed things down a
lot.  But, despite learning about this important ordering, in later
restructuring I forgot and made the same mistake of putting the
filtering after basename guided rename detection again.  So, I have this
series of patches structured to do the irrelevant filtering last to
start to show how much extra it costs, and then add relevant filtering
in to find_basename_matches() to show how much it speeds things up.
Basically, it's a way to reinforce something that apparently was too
easy to forget, and make sure the commit messages record this lesson.

Second, the items in the "relevant_sources" in this patch series will
include all sources that *might be* relevant.  It has to be conservative
and catch anything that might need a rename, but in the patch series
after this one we'll find ways to weed out more of the *might be*
relevant ones.  Unfortunately, merge-ort does not have sufficient
information to weed those ones out, and there isn't enough information
at the time of filtering exact renames out to remove the extra ones
either.  It has to be deferred.  So the deferral is in part to simplify
some later additions.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-10 22:18:04 -08:00
75555676ad builtin/init-db: handle bare clones when core.bare set to false
In 552955ed7f ("clone: use more conventional config/option layering",
2020-10-01), clone learned to read configuration options earlier in its
execution, before creating the new repository.  However, that led to a
problem: if the core.bare setting is set to false in the global config,
cloning a bare repository segfaults.  This happens because the
repository is falsely thought to be non-bare, but clone has set the work
tree to NULL, which is then dereferenced.

The code to initialize the repository already considers the fact that a
user might want to override the --bare option for git init, but it
doesn't take into account clone, which uses a different option.  Let's
just check that the work tree is not NULL, since that's how clone
indicates that the repository is bare.  This is also the case for git
init, so we won't be regressing that case.

Reported-by: Joseph Vusich <jvusich@amazon.com>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-10 15:06:48 -08:00
42efa1231a filter-branch: drop $_x40 glob
When checking whether a commit was rewritten to a single object id, we
use a glob that insists on a 40-hex result. This works for sha1, but
fails t7003 when run with GIT_TEST_DEFAULT_HASH=sha256.

Since the previous commit simplified the case statement here, we only
have two arms: an empty string or a single object id. We can just loosen
our glob to match anything, and still distinguish those cases (we lose
the ability to notice bogus input, but that's not a problem; we are the
one who wrote the map in the first place, and anyway update-ref will
complain loudly if the input isn't a valid hash).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-10 14:16:58 -08:00
98fe9e666f filter-branch: drop multiple-ancestor warning
When a ref maps to a commit that is neither rewritten nor kept by
filter-branch (e.g., because it was eliminated by rev-list's pathspec
selection), we rewrite it to its nearest ancestor.

Since the initial commit in 6f6826c52b (Add git-filter-branch,
2007-06-03), we have warned when there are multiple such ancestors in
the map file. However, the warning code is impossible to trigger these
days. Since a0e46390d3 (filter-branch: fix ref rewriting with
--subdirectory-filter, 2008-08-12), we find the ancestor using "rev-list
-1", so it can only ever have a single value.

This code is made doubly confusing by the fact that we append to the map
file when mapping ancestors. However, this can never yield multiple
values because:

  - we explicitly check whether the map already exists, and if so, do
    nothing (so our "append" will always be to a file that does not
    exist)

  - even if we were to try mapping twice, the process to do so is
    deterministic. I.e., we'd always end up with the same ancestor for a
    given sha1. So warning about it would be pointless; there is no
    ambiguity.

So swap out the warning code for a BUG (which we'll simplify further in
the next commit). And let's stop using the append operator to make the
ancestor-mapping code less confusing.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-10 14:14:52 -08:00
6d875d19fd t7003: test ref rewriting explicitly
After it has rewritten all of the commits, filter-branch will then
rewrite each of the input refs based on the resulting map of old/new
commits. But we don't have any explicit test coverage of this code.
Let's make sure we are covering each of those cases:

  - deleting a ref when all of its commits were pruned

  - rewriting a ref based on the mapping (this happens throughout the
    script, but let's make sure we generate the correct messages)

  - rewriting a ref whose tip was excluded, in which case we rewrite to
    the nearest ancestor. Note in this case that we still insist that no
    "warning" line is present (even though it looks like we'd trigger
    the "... was rewritten into multiple commits" one). See the next
    commit for more details.

Note these all pass currently, but the latter two will fail when run
with GIT_TEST_DEFAULT_HASH=sha256.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-10 14:14:19 -08:00
13d7ab6b5d Git 2.31-rc2 2021-03-08 16:09:43 -08:00
56a57652ef Sync with Git 2.30.2 for CVE-2021-21300
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-08 16:09:07 -08:00
6c46f864e5 Merge branch 'jt/transfer-fsck-across-packs-fix'
The code to fsck objects received across multiple packs during a
single git fetch session has been broken when the packfile URI
feature was in use.  A workaround has been added by disabling the
codepath to avoid keeping a packfile that is too small.

* jt/transfer-fsck-across-packs-fix:
  fetch-pack: do not mix --pack_header and packfile uri
2021-03-08 16:04:47 -08:00
834845142d l10n: de.po: Update German translation for Git v2.31.0
Reviewed-by: Ralf Thielow <ralf.thielow@gmail.com>
Reviewed-by: Phillip Szelat <phillip.szelat@gmail.com>
Signed-off-by: Matthias Rüster <matthias.ruester@gmail.com>
2021-03-08 19:49:33 +01:00
68b5c3aa48 Makefile: update 'make fuzz-all' docs to reflect modern clang
Clang no longer produces a libFuzzer.a. Instead, you can include
libFuzzer by using -fsanitize=fuzzer. Therefore we should use that in
the example command for building fuzzers.

We also add -fsanitize=fuzzer-no-link to the CFLAGS to ensure that all
the required instrumentation is added when compiling git [1], and remove
 -fsanitize-coverage=trace-pc-guard as it is deprecated.

I happen to have tested with LLVM 11 - however -fsanitize=fuzzer appears
to work in a wide range of reasonably modern clangs.

(On my system: what used to be libFuzzer.a now lives under the following
 path, which is tricky albeit not impossible for a novice such as myself
 to find:
/usr/lib64/clang/11.0.0/lib/linux/libclang_rt.fuzzer-x86_64.a )

[1] https://releases.llvm.org/11.0.0/docs/LibFuzzer.html#fuzzer-usage

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-08 10:26:25 -08:00
e8df3b6c6c Add entry for Ramkumar Ramachandra
Signed-off-by: Ramkumar Ramachandra <r@artagnon.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-08 09:56:34 -08:00
241b5d3ebe fix xcalloc() argument order
Pass the number of elements first and ther size second, as expected
by xcalloc().  Provide a semantic patch, which was actually used to
generate the rest of this patch.

The semantic patch would generate flip-flop diffs if both arguments
are sizeofs.  We don't have such a case, and it's hard to imagine
the usefulness of such an allocation.  If it ever occurs then we
could deal with it by duplicating the rule in the semantic patch to
make it cancel itself out, or we could change the code to use
CALLOC_ARRAY.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-08 09:45:04 -08:00
408985d301 l10n: pt_PT: add Portuguese translations part 1
* Newlines corrected.
* Add concept translation table.
* Translated some.
* Corrected some.
* Corrected some 'Negation of Emptiness'.

Signed-off-by: Daniel Santos <hello@brighterdan.com>
2021-03-08 15:21:51 +00:00
1369935987 l10n: vi.po(5104t): for git v2.31.0 l10n round 2
Signed-off-by: Tran Ngoc Quan <vnwildman@gmail.com>
2021-03-08 09:03:04 +07:00
b0adcc311b l10n: es: 2.31.0 round 2
Signed-off-by: Christopher Diaz Riveros <christopher.diaz.riv@gmail.com>
2021-03-07 18:31:14 -05:00
c21ad4d941 l10n: Add translation team info
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
2021-03-07 19:38:29 +07:00
8c4abfb8be l10n: start Indonesian translation
* Initialize PO file
  * Translate init-db.c
  * Translate wt-status.c
  * Translate builtin/clone.c
  * Translate builtin/checkout.c
  * Translate builtin/fetch.c
  * Complete core translations:
    * builtin/remote.c
    * builtin/index-pack.c
    * push.c
    * reset.c
  * Sync with l10n upstream

Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
2021-03-07 19:38:07 +07:00
2aec3bc4b6 fetch-pack: do not mix --pack_header and packfile uri
When fetching (as opposed to cloning) from a repository with packfile
URIs enabled, an error like this may occur:

 fatal: pack has bad object at offset 12: unknown object type 5
 fatal: finish_http_pack_request gave result -1
 fatal: fetch-pack: expected keep then TAB at start of http-fetch output

This bug was introduced in b664e9ffa1 ("fetch-pack: with packfile URIs,
use index-pack arg", 2021-02-22), when the index-pack args used when
processing the inline packfile of a fetch response and when processing
packfile URIs were unified.

This bug happens because fetch, by default, partially reads (and
consumes) the header of the inline packfile to determine if it should
store the downloaded objects as a packfile or loose objects, and thus
passes --pack_header=<...> to index-pack to inform it that some bytes
are missing. However, when it subsequently fetches the additional
packfiles linked by URIs, it reuses the same index-pack arguments, thus
wrongly passing --index-pack-arg=--pack_header=<...> when no bytes are
missing.

This does not happen when cloning because "git clone" always passes
do_keep, which instructs the fetch mechanism to always retain the
packfile, eliminating the need to read the header.

There are a few ways to fix this, including filtering out pack_header
arguments when downloading the additional packfiles, but I decided to
stick to always using index-pack throughout when packfile URIs are
present - thus, Git no longer needs to read the bytes, and no longer
needs --pack_header here.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-05 15:04:09 -08:00
0af760e261 stash show: learn stash.showIncludeUntracked
The previous commit teaches `git stash show --include-untracked`. It
may be desirable for a user to be able to always enable the
--include-untracked behavior. Teach the stash.showIncludeUntracked
config option which allows users to do this in a similar manner to
stash.showPatch.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-05 14:31:27 -08:00
d3c7bf73bd stash show: teach --include-untracked and --only-untracked
Stash entries can be made with untracked files via
`git stash push --include-untracked`. However, because the untracked
files are stored in the third parent of the stash entry and not the
stash entry itself, running `git stash show` does not include the
untracked files as part of the diff.

With --include-untracked, untracked paths, which are recorded in the
third-parent if it exists, are shown in addition to the paths that have
modifications between the stash base and the working tree in the stash.

It is possible to manually craft a malformed stash entry where duplicate
untracked files in the stash entry will mask tracked files. We detect
and error out in that case via a custom unpack_trees() callback:
stash_worktree_untracked_merge().

Also, teach stash the --only-untracked option which only shows the
untracked files of a stash entry. This is similar to `git show stash^3`
but it is nice to provide a convenient abstraction for it so that users
do not have to think about the underlying implementation.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-05 14:31:26 -08:00
ccae01cab8 builtin/repack.c: reword comment around pack-objects flags
The comment in this block is meant to indicate that passing '--all',
'--reflog', and so on aren't necessary when repacking with the
'--geometric' option.

But, it has two problems: first, it is factually incorrect ('--all' is
*not* incompatible with '--stdin-packs' as the comment suggests);
second, it is quite focused on the geometric case for a block that is
guarding against it.

Reword this comment to address both issues.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-05 11:33:52 -08:00
2a15964128 builtin/repack.c: be more conservative with unsigned overflows
There are a number of places in the geometric repack code where we
multiply the number of objects in a pack by another unsigned value. We
trust that the number of objects in a pack is always representable by a
uint32_t, but we don't necessarily trust that that number can be
multiplied without overflow.

Sprinkle some unsigned_add_overflows() and unsigned_mult_overflows() in
split_pack_geometry() to check that we never overflow any unsigned types
when adding or multiplying them.

Arguably these checks are a little too conservative, and certainly they
do not help the readability of this function. But they are serving a
useful purpose, so I think they are worthwhile overall.

Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-05 11:33:52 -08:00
13d746a303 builtin/repack.c: assign pack split later
To determine the where to place the split when repacking with the
'--geometric' option, split_pack_geometry() assigns the "split" variable
and then decrements it in a loop.

It would be equivalent (and more readable) to assign the split to the
loop position after exiting the loop, so do that instead.

Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-05 11:33:52 -08:00
dab3247734 t7703: test --geometric repack with loose objects
We don't currently have a test that demonstrates the non-idempotent
behavior of 'git repack --geometric' with loose objects, so add one here
to make sure we don't regress in this area.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-05 11:33:52 -08:00
f25e33c156 builtin/repack.c: do not repack single packs with --geometric
In 0fabafd0b9 (builtin/repack.c: add '--geometric' option, 2021-02-22),
the 'git repack --geometric' code aborts early when there is zero or one
pack.

When there are no packs, this code does the right thing by placing the
split at "0". But when there is exactly one pack, the split is placed at
"1", which means that "git repack --geometric" (with any factor)
repacks all of the objects in a single pack.

This is wasteful, and the remaining code in split_pack_geometry() does
the right thing (not repacking the objects in a single pack) even when
only one pack is present.

Loosen the guard to only stop when there aren't any packs, and let the
rest of the code do the right thing. Add a test to ensure that this is
the case.

Noticed-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-05 11:33:52 -08:00
8278f87022 l10n: zh_TW.po: v2.31.0 round 2 (15 untranslated)
Signed-off-by: Yi-Jyun Pan <pan93412@gmail.com>
2021-03-06 02:43:34 +08:00
2f176de687 l10n: bg.po: Updated Bulgarian translation (5104t)
Signed-off-by: Alexander Shopov <ash@kambanaria.org>
2021-03-05 12:12:34 +01:00
1ecef023a9 Merge branch 'fr_next' of github.com:jnavila/git
* 'fr_next' of github.com:jnavila/git:
  l10n: fr: v2.31 rnd 2
2021-03-05 13:47:07 +08:00
5b888ad949 Merge branch 'master' of github.com:nafmo/git-l10n-sv
* 'master' of github.com:nafmo/git-l10n-sv:
  l10n: sv.po: Update Swedish translation (5104t0f0u)
2021-03-05 13:46:25 +08:00
be7935ed8b Merged the open-eintr workaround for macOS
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-04 15:42:50 -08:00
58d581c344 Documentation/RelNotes: improve release note for rename detection work
There were some early changes in the 2.31 cycle to optimize some setup
in diffcore-rename.c[1], some later changes to measure performance[2],
and finally some significant changes to improve rename detection
performance.  The final one was merged with the note

   Performance optimization work on the rename detection continues.

That works for the commit log, but feels misleading as a release note
since all the changes were within one cycle.  Simplify this to just

   Performance improvements for rename detection.

The former wording could be seen as hinting that more performance
improvements will come in 2.32, which is true, but we can just cover
those in the 2.32 release notes when the time comes.

[1] a5ac31b5b1 (Merge branch 'en/diffcore-rename', 2021-01-25)
[2] d3a035b055 (Merge branch 'en/merge-ort-perf', 2021-02-11)
[3] 12bd17521c (Merge branch 'en/diffcore-rename', 2021-03-01)

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-04 15:38:11 -08:00
921846fa22 Merge branch 'jk/open-returns-eintr'
Work around platforms whose open() is reported to return EINTR (it
shouldn't, as we do our signals with SA_RESTART).

* jk/open-returns-eintr:
  config.mak.uname: enable OPEN_RETURNS_EINTR for macOS Big Sur
  Makefile: add OPEN_RETURNS_EINTR knob
2021-03-04 15:34:45 -08:00
068cb92300 l10n: fr: v2.31 rnd 2
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
2021-03-04 21:53:45 +01:00
85c787f1e9 Merge https://github.com/prati0100/git-gui
* https://github.com/prati0100/git-gui:
  Revert "git-gui: remove lines starting with the comment character"
2021-03-04 12:38:50 -08:00
f6a7e896b8 l10n: tr: v2.31.0-rc1
Signed-off-by: Emir Sarı <bitigchi@me.com>
2021-03-04 22:29:24 +03:00
929dc48e96 l10n: sv.po: Update Swedish translation (5104t0f0u)
Signed-off-by: Peter Krefting <peter@softwolves.pp.se>
2021-03-04 19:10:43 +01:00
9b7e82b940 l10n: git.pot: v2.31.0 round 2 (9 new, 8 removed)
Generate po/git.pot from v2.31.0-rc1 for git v2.31.0 l10n round 2.

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
2021-03-04 22:41:21 +08:00
4dd8469336 Merge branch 'master' of github.com:git/git
* 'master' of github.com:git/git: (63 commits)
  Git 2.31-rc1
  Hopefully the last batch before -rc1
  Revert "commit-graph: when incompatible with graphs, indicate why"
  read-cache: make the index write buffer size 128K
  dir: fix malloc of root untracked_cache_dir
  commit-graph.c: display correct number of chunks when writing
  doc/reftable: document how to handle windows
  fetch-pack: print and use dangling .gitmodules
  fetch-pack: with packfile URIs, use index-pack arg
  http-fetch: allow custom index-pack args
  http: allow custom index-pack args
  chunk-format: add technical docs
  chunk-format: restore duplicate chunk checks
  midx: use 64-bit multiplication for chunk sizes
  midx: use chunk-format read API
  commit-graph: use chunk-format read API
  chunk-format: create read chunk API
  midx: use chunk-format API in write_midx_internal()
  midx: drop chunk progress during write
  midx: return success/failure in chunk write methods
  ...
2021-03-04 22:40:13 +08:00
df4f9e28f6 Merge branch 'py/revert-commit-comments'
This commit causes breakage on macOS, or in fact any platform using
older versions of Tcl. Revert it.

* py/revert-commit-comments:
  Revert "git-gui: remove lines starting with the comment character"
2021-03-04 13:59:45 +05:30
c0698df057 Revert "git-gui: remove lines starting with the comment character"
This reverts commit b9a43869c9.

This commit causes breakage on macOS (10.13). It causes errors on
startup and completely breaks the commit functionality. There are two
main problems. First, it uses `string cat` which is not supported on
older Tcl versions. Second, it does a half close of the bidirectional
pipe to git-stripspace which is also not supported on older Tcl
versions.

Reported-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
2021-03-04 13:53:27 +05:30
ea7e63921c doc: .gitignore documentation typofix
Signed-off-by: Julien Richard <julien.richard@ubisoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-03 17:16:48 -08:00
12604a8d0c t9801: replace test -f with test_path_is_file
Although `test -f` has the same functionality as test_path_is_file(), in
the case where test_path_is_file() fails, we get much better debugging
information.

Replace `test -f` with test_path_is_file so that future developers
will have a better experience debugging these test cases.

Signed-off-by: Shubham Verma <shubhunic@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-03 17:11:31 -08:00
93c3d297b5 git mv foo FOO ; git mv foo bar gave an assert
The following sequence, on a case-insensitive file system,
(strictly speeking with core.ignorecase=true)
leads to an assertion failure and leaves .git/index.lock behind.

git init
echo foo >foo
git add foo
git mv foo FOO
git mv foo bar

This regression was introduced in Commit 9b906af657,
"git-mv: improve error message for conflicted file"

The bugfix is to change the "file exist case-insensitive in the index"
into the correct "file exist (case-sensitive) in the index".

This avoids the "assert" later in the code and keeps setting up the
"ce" pointer for ce_stage(ce) done in the next else if.

This fixes
https://github.com/git-for-windows/git/issues/2920

Reported-By: Dan Moseley <Dan.Moseley@microsoft.com>
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-03 17:07:12 -08:00
f451960708 git-cat-file.txt: remove references to "sha1"
As part of the hash-transition, git can operate on more than just SHA-1
repositories. Replace "sha1"-specific documentation with hash-agnostic
terminology.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-03 16:43:06 -08:00
4f0ba2d533 git-cat-file.txt: monospace args, placeholders and filenames
In modern documentation, args, placeholders and filenames are
monospaced. Apply monospace formatting to these objects.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-03 16:43:03 -08:00
f01623b2c9 Git 2.31-rc1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-02 22:41:13 -08:00
3ed77c4792 l10n: tr: v2.31.0-rc0
Signed-off-by: Emir Sarı <bitigchi@me.com>
2021-03-02 20:14:46 +08:00
ec125d1bc1 Hopefully the last batch before -rc1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-01 14:02:58 -08:00
9889cff6d6 Merge branch 'jh/untracked-cache-fix'
An under-allocation for the untracked cache data has been corrected.

* jh/untracked-cache-fix:
  dir: fix malloc of root untracked_cache_dir
2021-03-01 14:02:58 -08:00
ada7c5fae5 Merge branch 'ns/raise-write-index-buffer-size'
Raise the buffer size used when writing the index file out from
(obviously too small) 8kB to (clearly sufficiently large) 128kB.

* ns/raise-write-index-buffer-size:
  read-cache: make the index write buffer size 128K
2021-03-01 14:02:58 -08:00
28714238c8 Merge branch 'hv/trailer-formatting'
The logic to handle "trailer" related placeholders in the
"--format=" mechanisms in the "log" family and "for-each-ref"
family is getting unified.

* hv/trailer-formatting:
  ref-filter: use pretty.c logic for trailers
  pretty.c: capture invalid trailer argument
  pretty.c: refactor trailer logic to `format_set_trailers_options()`
  t6300: use function to test trailer options
2021-03-01 14:02:58 -08:00
18aabfaee5 Merge branch 'hn/reftable-tables-doc-update'
Documentation update.

* hn/reftable-tables-doc-update:
  doc/reftable: document how to handle windows
2021-03-01 14:02:57 -08:00
fbad3505ee Merge branch 'sv/t7001-modernize'
Test script modernization.

* sv/t7001-modernize:
  t7001: use `test` rather than `[`
  t7001: use here-docs instead of echo
  t7001: put each command on a separate line
  t7001: use '>' rather than 'touch'
  t7001: avoid using `cd` outside of subshells
  t7001: remove whitespace after redirect operators
  t7001: modernize subshell formatting
  t7001: remove unnecessary blank lines
  t7001: indent with TABs instead of spaces
  t7001: modernize test formatting
2021-03-01 14:02:57 -08:00
6ee353d42f Merge branch 'jt/transfer-fsck-across-packs'
The approach to "fsck" the incoming objects in "index-pack" is
attractive for performance reasons (we have them already in core,
inflated and ready to be inspected), but fundamentally cannot be
applied fully when we receive more than one pack stream, as a tree
object in one pack may refer to a blob object in another pack as
".gitmodules", when we want to inspect blobs that are used as
".gitmodules" file, for example.  Teach "index-pack" to emit
objects that must be inspected later and check them in the calling
"fetch-pack" process.

* jt/transfer-fsck-across-packs:
  fetch-pack: print and use dangling .gitmodules
  fetch-pack: with packfile URIs, use index-pack arg
  http-fetch: allow custom index-pack args
  http: allow custom index-pack args
2021-03-01 14:02:57 -08:00
660dd97a62 Merge branch 'ds/chunked-file-api'
The common code to deal with "chunked file format" that is shared
by the multi-pack-index and commit-graph files have been factored
out, to help codepaths for both filetypes to become more robust.

* ds/chunked-file-api:
  commit-graph.c: display correct number of chunks when writing
  chunk-format: add technical docs
  chunk-format: restore duplicate chunk checks
  midx: use 64-bit multiplication for chunk sizes
  midx: use chunk-format read API
  commit-graph: use chunk-format read API
  chunk-format: create read chunk API
  midx: use chunk-format API in write_midx_internal()
  midx: drop chunk progress during write
  midx: return success/failure in chunk write methods
  midx: add num_large_offsets to write_midx_context
  midx: add pack_perm to write_midx_context
  midx: add entries to write_midx_context
  midx: use context in write_midx_pack_names()
  midx: rename pack_info to write_midx_context
  commit-graph: use chunk-format write API
  chunk-format: create chunk format write API
  commit-graph: anonymize data in chunk_write_fn
2021-03-01 14:02:57 -08:00
12bd17521c Merge branch 'en/diffcore-rename'
Performance optimization work on the rename detection continues.

* en/diffcore-rename:
  merge-ort: call diffcore_rename() directly
  gitdiffcore doc: mention new preliminary step for rename detection
  diffcore-rename: guide inexact rename detection based on basenames
  diffcore-rename: complete find_basename_matches()
  diffcore-rename: compute basenames of source and dest candidates
  t4001: add a test comparing basename similarity and content similarity
  diffcore-rename: filter rename_src list when possible
  diffcore-rename: no point trying to find a match better than exact
2021-03-01 14:02:56 -08:00
700696bcfc Merge branch 'jh/fsmonitor-prework'
Preliminary changes to fsmonitor integration.

* jh/fsmonitor-prework:
  fsmonitor: refactor initialization of fsmonitor_last_update token
  fsmonitor: allow all entries for a folder to be invalidated
  fsmonitor: log FSMN token when reading and writing the index
  fsmonitor: log invocation of FSMonitor hook to trace2
  read-cache: log the number of scanned files to trace2
  read-cache: log the number of lstat calls to trace2
  preload-index: log the number of lstat calls to trace2
  p7519: add trace logging during perf test
  p7519: move watchman cleanup earlier in the test
  p7519: fix watchman watch-list test on Windows
  p7519: do not rely on "xargs -d" in test
2021-03-01 14:02:56 -08:00
273c9901c2 pretty: document multiple %(describe) being inconsistent
Each %(describe) placeholder is expanded using a separate git describe
call.  Their outputs depend on the tags present at the time, so there's
no consistency guarantee.  Document that fact.

Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-01 09:50:27 -08:00
09fe8ca92e t4205: assert %(describe) test coverage
Document that the test is covering both describable and
undescribable commits.

Suggested-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-01 09:42:17 -08:00
90917373cd Merge https://github.com/prati0100/git-gui
* https://github.com/prati0100/git-gui:
  git-gui: remove lines starting with the comment character
  git-gui: fix typo in russian locale
2021-03-01 09:22:18 -08:00
c0b27e3964 Merge branch 'js/commit-graph-warning'
* js/commit-graph-warning:
  Revert "commit-graph: when incompatible with graphs, indicate why"
2021-03-01 09:21:24 -08:00
cdc986a7c2 Revert "commit-graph: when incompatible with graphs, indicate why"
This reverts commit c85eec7fc3, as
it is a bit overzealous, we are in prerelease freeze, and we want
to have enough time to get this right and cook in 'next'.

cf. <8735xgkvuo.fsf@evledraar.gmail.com>
2021-03-01 09:19:37 -08:00
bbabaad298 config.mak.uname: enable OPEN_RETURNS_EINTR for macOS Big Sur
We've had mixed reports on whether the latest release of macOS needs
this Makefile knob set. In most reported cases, there's antivirus
software running (which one might imagine could cause an open() call to
be delayed). However, one of the (off-list) reports I've gotten
indicated that it happened on an otherwise clean install of Big Sur.

Since the symptom is so bad (checkout randomly fails to write several
fails when the progress meter kicks in), and since the workaround is so
lightweight (if we don't see EINTR, it's just an extra conditional
check), let's just turn it on by default.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-01 09:07:45 -08:00
23c781f173 githooks.txt: clarify documentation on reference-transaction hook
The reference-transaction hook doesn't clearly document its scope and
what values it receives as input. Document it to make it less surprising
and clearly delimit its (current) scope.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-01 09:02:01 -08:00
5f308a89d8 githooks.txt: replace mentions of SHA-1 specific properties
The githooks(5) documentation states in several places that the hook
will receive a SHA-1 or hashes of 40 characters length. Given that we're
transitioning to a world where both SHA-1 and SHA-256 are supported,
this is inaccurate.

Fix the issue by replacing mentions of SHA-1 with "object name" and not
explicitly mentioning the hash size.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-01 09:02:01 -08:00
75f5efcba2 Merge branch 'master' of github.com:nafmo/git-l10n-sv
* 'master' of github.com:nafmo/git-l10n-sv:
  l10n: sv.po: Update Swedish translation (5103t0f0u)
2021-03-01 10:01:02 +08:00
0b71d789a8 Merge branch 'pl' of github.com:Arusekk/git-po
* 'pl' of github.com:Arusekk/git-po:
  l10n: pl.po: Update translation
2021-03-01 09:59:07 +08:00
fe8885258b l10n: sv.po: Update Swedish translation (5103t0f0u)
Signed-off-by: Peter Krefting <peter@softwolves.pp.se>
2021-02-28 22:22:46 +01:00
fa42d191c6 l10n: pl.po: Update translation
Signed-off-by: Arusekk <arek_koz@o2.pl>
2021-02-27 17:17:32 +01:00
5ff5a30652 l10n: fr: v2.31.0 rnd 1
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
2021-02-27 15:47:45 +01:00
81afdf7a2e diffcore-rename: compute dir_rename_guess from dir_rename_counts
dir_rename_counts has a mapping of a mapping, in particular, it has
   old_dir => { new_dir => count }
We want a simple mapping of
   old_dir => new_dir
based on which new_dir had the highest count for a given old_dir.
Compute this and store it in dir_rename_guess.

This is the final piece of the puzzle needed to make our guesses at
which directory files have been moved to when basenames aren't unique.

For the testcases mentioned in commit 557ac0350d ("merge-ort: begin
performance work; instrument with trace2_region_* calls", 2020-10-28),
this change improves the performance as follows:

                            Before                  After
    no-renames:       12.775 s ±  0.062 s    12.596 s ±  0.061 s
    mega-renames:    188.754 s ±  0.284 s   130.465 s ±  0.259 s
    just-one-mega:     5.599 s ±  0.019 s     3.958 s ±  0.010 s

Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-26 17:53:12 -08:00
333899e1e3 diffcore-rename: limit dir_rename_counts computation to relevant dirs
We are using dir_rename_counts to count the number of other directories
that files within a directory moved to.  We only need this information
for directories that disappeared, though, so we can return early from
update_dir_rename_counts() for other paths.

If dirs_removed is passed to diffcore_rename_extended(), then it
provides the relevant bits of information for us to limit this counting
to relevant dirs.  If dirs_removed is not passed, we would need to
compute some replacement in order to do this limiting.  Introduce a new
info->relevant_source_dirs variable for this purpose, even though at
this stage we will only set it to dirs_removed for simplicity.

Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-26 17:53:12 -08:00
1ad69eb0dc diffcore-rename: compute dir_rename_counts in stages
Compute dir_rename_counts based just on exact renames to start, as that
can provide us useful information in find_basename_matches().  This is
done by moving the code from compute_dir_rename_counts() into
initialize_dir_rename_info(), resulting in it being computed earlier and
based just on exact renames.  Since that's an incomplete result, we
augment the counts via calling update_dir_rename_counts() after each
basename-guide and inexact rename detection match is found.

Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-26 17:53:12 -08:00
b1473019e8 diffcore-rename: extend cleanup_dir_rename_info()
When diffcore_rename_extended() is passed a NULL dir_rename_count, we
will still want to create a temporary one for use by
find_basename_matches(), but have it fully deallocated before
diffcore_rename_extended() returns.  However, when
diffcore_rename_extended() is passed a dir_rename_count, we want to fill
that strmap with appropriate values and return it.  However, for our
interim purposes we may also add entries corresponding to directories
that cannot have been renamed due to still existing on both sides.

Extend cleanup_dir_rename_info() to handle these two different cases,
cleaning up the relevant bits of information for each case.

Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-26 17:53:12 -08:00
b6e3d27434 diffcore-rename: move dir_rename_counts into dir_rename_info struct
This continues the migration of the directory rename detection code into
diffcore-rename, now taking the simple step of combining it with the
dir_rename_info struct.  Future commits will then make dir_rename_counts
be computed in stages, and add computation of dir_rename_guess.

Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-26 17:53:11 -08:00
cd52e0050f diffcore-rename: add function for clearing dir_rename_count
As we adjust the usage of dir_rename_count we want to have a function
for clearing, or partially clearing it out.  Add a
partial_clear_dir_rename_count() function for this purpose.

Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-26 17:53:11 -08:00
0c4fd732f0 Move computation of dir_rename_count from merge-ort to diffcore-rename
Move the computation of dir_rename_count from merge-ort.c to
diffcore-rename.c, making slight adjustments to the data structures
based on the move.  While the diffstat looks large, viewing this commit
with --color-moved makes it clear that only about 20 lines changed.

With this patch, the computation of dir_rename_count is still only done
after inexact rename detection, but subsequent commits will add a
preliminary computation of dir_rename_count after exact rename
detection, followed by some updates after inexact rename detection.

Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-26 17:53:11 -08:00
ae8cf74d3f diffcore-rename: add a mapping of destination names to their indices
Compute a mapping of full filename to the index within rename_dst where
that filename is found, and store it in idx_map.  idx_possible_rename()
needs this to quickly finding an array entry in rename_dst given the
pathname.

While at it, add placeholder initializations for dir_rename_count and
dir_rename_guess; these will be more fully populated in subsequent
commits.

Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-26 17:53:11 -08:00
bde8b9f34c diffcore-rename: provide basic implementation of idx_possible_rename()
Add a new struct dir_rename_info with various values we need inside our
idx_possible_rename() function introduced in the previous commit.  Add a
basic implementation for this function showing how we plan to use the
variables, but which will just return early with a value of -1 (not
found) when those variables are not set up.

Future commits will do the work necessary to set up those other
variables so that idx_possible_rename() does not always return -1.

Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-26 17:53:11 -08:00
37a2514364 diffcore-rename: use directory rename guided basename comparisons
A previous commit noted that it is very common for people to move files
across directories while keeping their filename the same.  The last few
commits took advantage of this and showed that we can accelerate rename
detection significantly using basenames; since files with the same
basename serve as likely rename candidates, we can check those first and
remove them from the rename candidate pool if they are sufficiently
similar.

Unfortunately, the previous optimization was limited by the fact that
the remaining basenames after exact rename detection are not always
unique.  Many repositories have hundreds of build files with the same
name (e.g. Makefile, .gitignore, build.gradle, etc.), and may even have
hundreds of source files with the same name.  (For example, the linux
kernel has 100 setup.c, 87 irq.c, and 112 core.c files.  A repository at
$DAYJOB has a lot of ObjectFactory.java and Plugin.java files).

For these files with non-unique basenames, we are faced with the task of
attempting to determine or guess which directory they may have been
relocated to.  Such a task is precisely the job of directory rename
detection.  However, there are two catches: (1) the directory rename
detection code has traditionally been part of the merge machinery rather
than diffcore-rename.c, and (2) directory rename detection currently
runs after regular rename detection is complete.  The 1st catch is just
an implementation issue that can be overcome by some code shuffling.
The 2nd requires us to add a further approximation: we only have access
to exact renames at this point, so we need to do directory rename
detection based on just exact renames.  In some cases we won't have
exact renames, in which case this extra optimization won't apply.  We
also choose to not apply the optimization unless we know that the
underlying directory was removed, which will require extra data to be
passed in to diffcore_rename_extended().  Also, even if we get a
prediction about which directory a file may have relocated to, we will
still need to check to see if there is a file in the predicted
directory, and then compare the two files to see if they meet the higher
min_basename_score threshold required for marking the two files as
renames.

This commit introduces an idx_possible_rename() function which will
do this directory rename detection for us and give us the index within
rename_dst of the resulting filename.  For now, this function is
hardcoded to return -1 (not found) and just hooks up how its results
would be used once we have a more complete implementation in place.

Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-26 17:53:11 -08:00
66f52fa26b pack-revindex.c: don't close unopened file descriptors
When opening a reverse index, load_revindex_from_disk() jumps to the
'cleanup' label in case something goes wrong: the reverse index had the
wrong size, an unrecognized version, or similar.

It also jumps to this label when the reverse index couldn't be opened in
the first place, which will cause an error with the unguarded close()
call in the label.

Guard this call with "if (fd >= 0)" to make sure that we have a valid
file descriptor to close before attempting to close it.

Reported-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-26 14:42:27 -08:00
36e834abc1 t/perf: avoid copying worktree files from test repo
When running the perf suite, we copy files from an existing $GIT_DIR to
a scratch repository to give us a realistic setup on which to operate.
Since the perf scripts themselves may modify the scratch repository, we
want to make sure we've scrubbed any references back to the original.

One existing example is that we avoid copying the file "commondir" at
the top-level of the repository. In a worktree git-dir (e.g.,
.git/worktrees/foo), that file contains the path to the parent
repository; copying it could mean ref updates in the scratch repository
affect the original.

But there are other files we should cover, too:

  - "gitdir" in a worktree git-dir contains the path to the actual .git
    file in the working tree. We _shouldn't_ end up looking at it at
    all, since the lack of a "commondir" file means Git won't consider
    this to be a worktree git-dir. But it's best to err on the safe
    side.

  - in a parent repository that contains worktrees, the
    "$GIT_DIR/worktrees" directory will contain the git dirs for the
    individual worktrees. Which will themselves contain commondir and
    gitdir files that may reference the original repository. We should
    likewise remove them.

    Note that this does mean that the perf suite's scratch repositories
    will never have any worktrees. That's OK; we don't have any perf tests
    that are influenced by their presence. If we add any, they'd
    probably want to create the worktrees themselves anyway.

This patch adds both paths to the set of omissions in
test_perf_copy_repo_contents(). Note that we won't get confused here by
matching arbitrary names like refs/heads/commondir. This list is always
matching top-level entries in $GIT_DIR (we rely on "cp -R" to do the
actual recursion).

Suggested-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-26 14:21:04 -08:00
85b87a5396 t/perf: handle worktrees as test repos
The perf suite gets confused when test_perf_default_repo is pointed at a
worktree (which includes when it is run from within a worktree at all,
since the default is to use the current repository).

Here's an example:

  $ git worktree add ~/foo
  Preparing worktree (new branch 'foo')
  HEAD is now at 328c109303 The eighth batch
  $ cd ~/foo
  $ make
  [...build output...]
  $ cd t/perf
  $ ./p0000-perf-lib-sanity.sh -v -i
  [...]
  perf 1 - test_perf_default_repo works:
  running:
  	foo=$(git rev-parse HEAD) &&
  	test_export foo

  fatal: ambiguous argument 'HEAD': unknown revision or path not in the working tree.
  Use '--' to separate paths from revisions, like this:
  'git <command> [<revision>...] -- [<file>...]'

The problem is that we didn't copy all of the necessary files from the
source repository (in this case we got HEAD, but we have no refs!). We
discover the git-dir with "rev-parse --git-dir", but this points to the
worktree's partial repository in .../.git/worktrees/foo.

That partial repository has a "commondir" file which points to the main
repository, where the actual refs are stored, but we don't copy it. This
is the correct thing to do, though! If we did copy it, then our scratch
test repo would be pointing back to the original main repo, and any ref
updates we made in the tests would impact that original repo.

Instead, we need to either:

  1. Make a scratch copy of the original main repo (in addition to the
     worktree repo), and point the scratch worktree repo's commondir at
     it. This preserves the original relationship, but it's doubtful any
     script really cares (if they are testing worktree performance,
     they'd probably make their own worktrees). And it's trickier to get
     right.

  2. Collapse the main and worktree repos into a single scratch repo.
     This can be done by copying everything from both, preferring any
     files from the worktree repo.

This patch does the second one. With this applied, the example above
results in p0000 running successfully.

Reported-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-26 14:21:04 -08:00
2b08101204 Makefile: add OPEN_RETURNS_EINTR knob
On some platforms, open() reportedly returns EINTR when opening regular
files and we receive a signal (usually SIGALRM from our progress meter).
This shouldn't happen, as open() should be a restartable syscall, and we
specify SA_RESTART when setting up the alarm handler. So it may actually
be a kernel or libc bug for this to happen. But it has been reported on
at least one version of Linux (on a network filesystem):

  https://lore.kernel.org/git/c8061cce-71e4-17bd-a56a-a5fed93804da@neanderfunk.de/

as well as on macOS starting with Big Sur even on a regular filesystem.

We can work around it by retrying open() calls that get EINTR, just as
we do for read(), etc. Since we don't ever _want_ to interrupt an open()
call, we can get away with just redefining open, rather than insisting
all callsites use xopen().

We actually do have an xopen() wrapper already (and it even does this
retry, though there's no indication of it being an observed problem back
then; it seems simply to have been lifted from xread(), etc). But it is
used hardly anywhere, and isn't suitable for general use because it will
die() on error. In theory we could combine the two, but it's awkward to
do so because of the variable-args interface of open().

This patch adds a Makefile knob for enabling the workaround. It's not
enabled by default for any platforms in config.mak.uname yet, as we
don't have enough data to decide how common this is (I have not been
able to reproduce on either Linux or Big Sur myself). It may be worth
enabling preemptively anyway, since the cost is pretty low (if we don't
see an EINTR, it's just an extra conditional).

However, note that we must not enable this on Windows. It doesn't do
anything there, and the macro overrides the existing mingw_open()
redirection. I've added a preemptive #undef here in the mingw header
(which is processed first) to just quietly disable it (we could also
make it an #error, but there is little point in being so aggressive).

Reported-by: Aleksey Kliger <alklig@microsoft.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-26 14:15:51 -08:00
6fab35f748 convert: fail gracefully upon missing clean cmd on required filter
The gitattributes documentation mentions that either the clean cmd or
the smudge cmd can be left unspecified in a filter definition. However,
when the filter is marked as 'required', the absence of any one of these
two should be treated as an error. Git already fails under these
circumstances, but not always in a pleasant way: omitting a clean cmd in
a required filter triggers an assertion error which leaves the user with
a quite verbose message:

git: convert.c:1459: convert_to_git_filter_fd: Assertion "ca.drv->clean || ca.drv->process" failed.

This assertion is not really necessary, as the apply_filter() call below
it already performs the same check. And when this condition is not met,
the function returns 0, making the caller die() with a much nicer
message. (Also note that die()-ing here is the right behavior as
`would_convert_to_git_filter_fd() == true` is a precondition to use
convert_to_git_filter_fd(), and the former is only true when the filter
is required.) So remove the assertion and add two regression tests to
make sure that git fails nicely when either the smudge or clean command
is missing on a required filter.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-26 11:20:02 -08:00
712b0ed6ec l10n: git.pot: v2.31.0 round 1 (155 new, 89 removed)
Generate po/git.pot from v2.31.0-rc0 for git v2.31.0 l10n round 1.

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
2021-02-26 22:09:42 +08:00
225365fb51 Git 2.31-rc0
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-25 16:43:33 -08:00
140045821a Merge branch 'jc/push-delete-nothing'
"git push $there --delete ''" should have been diagnosed as an
error, but instead turned into a matching push, which has been
corrected.

* jc/push-delete-nothing:
  push: do not turn --delete '' into a matching push
2021-02-25 16:43:33 -08:00
cadae717d5 Merge branch 'sh/mergetools-vimdiff1'
Mergetools update.

* sh/mergetools-vimdiff1:
  mergetools/vimdiff: add vimdiff1 merge tool variant
2021-02-25 16:43:32 -08:00
09e72204f8 Merge branch 'dl/doc-config-camelcase'
A handful of multi-word configuration variable names in
documentation that are spelled in all lowercase have been corrected
to use the more canonical camelCase.

* dl/doc-config-camelcase:
  index-format doc: camelCase core.excludesFile
  blame-options.txt: camelcase blame.blankBoundary
  i18n.txt: camel case and monospace "i18n.commitEncoding"
2021-02-25 16:43:32 -08:00
1c8f5dfa42 Merge branch 'js/params-vs-args'
Messages update.

* js/params-vs-args:
  replace "parameters" by "arguments" in error messages
2021-02-25 16:43:32 -08:00
d228b6b231 Merge branch 'ug/doc-commit-approxidate'
Doc update.

* ug/doc-commit-approxidate:
  doc: mention approxidates for git-commit --date
2021-02-25 16:43:32 -08:00
d166e8c1d4 Merge branch 'es/maintenance-of-bare-repositories'
The "git maintenance register" command had trouble registering bare
repositories, which had been corrected.

* es/maintenance-of-bare-repositories:
  maintenance: fix incorrect `maintenance.repo` path with bare repository
2021-02-25 16:43:32 -08:00
f277234860 Merge branch 'mt/add-chmod-fixes'
Various fixes on "git add --chmod".

* mt/add-chmod-fixes:
  add: propagate --chmod errors to exit status
  add: mark --chmod error string for translation
  add --chmod: don't update index when --dry-run is used
2021-02-25 16:43:31 -08:00
48923e8356 Merge branch 'ds/merge-base-independent'
The code to implement "git merge-base --independent" was poorly
done and was kept from the very beginning of the feature.

* ds/merge-base-independent:
  commit-reach: stale commits may prune generation further
  commit-reach: use heuristic in remove_redundant()
  commit-reach: move compare_commits_by_gen
  commit-reach: use one walk in remove_redundant()
  commit-reach: reduce requirements for remove_redundant()
2021-02-25 16:43:31 -08:00
682bbad64d Merge branch 'ah/rebase-no-fork-point-config'
"git rebase --[no-]fork-point" gained a configuration variable
rebase.forkPoint so that users do not have to keep specifying a
non-default setting.

* ah/rebase-no-fork-point-config:
  rebase: add a config option for --no-fork-point
2021-02-25 16:43:31 -08:00
628c13ccee Merge branch 'mt/grep-sparse-checkout'
"git grep" has been tweaked to be limited to the sparse checkout
paths.

* mt/grep-sparse-checkout:
  grep: honor sparse-checkout on working tree searches
2021-02-25 16:43:31 -08:00
3c8e6dda21 Merge branch 'ah/commit-graph-leakplug'
Plug a minor memory leak.

* ah/commit-graph-leakplug:
  commit-graph: avoid leaking topo_levels slab in write_commit_graph()
2021-02-25 16:43:31 -08:00
6eea44cee1 Merge branch 'zh/difftool-skip-to'
"git difftool" learned "--skip-to=<path>" option to restart an
interrupted session from an arbitrary path.

* zh/difftool-skip-to:
  difftool.c: learn a new way start at specified file
2021-02-25 16:43:31 -08:00
ccf6861b72 Merge branch 'cw/pack-config-doc'
Doc update.

* cw/pack-config-doc:
  doc: mention bigFileThreshold for packing
2021-02-25 16:43:31 -08:00
dddb420535 Merge branch 'jc/maint-column-doc-typofix'
Doc update.

* jc/maint-column-doc-typofix:
  Documentation: typofix --column description
2021-02-25 16:43:30 -08:00
2638e33c82 Merge branch 'ma/doc-markup-fix'
Docfix.

* ma/doc-markup-fix:
  gitmailmap.txt: fix rendering of e-mail addresses
  git.txt: fix monospace rendering
  rev-list-options.txt: fix rendering of bonus paragraph
2021-02-25 16:43:30 -08:00
845d6030f8 Merge branch 'jc/diffcore-rotate'
"git {diff,log} --{skip,rotate}-to=<path>" allows the user to
discard diff output for early paths or move them to the end of the
output.

* jc/diffcore-rotate:
  diff: --{rotate,skip}-to=<path>
2021-02-25 16:43:30 -08:00
3da165ca28 Merge branch 'mt/checkout-index-corner-cases'
The error codepath around the "--temp/--prefix" feature of "git
checkout-index" has been improved.

* mt/checkout-index-corner-cases:
  checkout-index: omit entries with no tempname from --temp output
  write_entry(): fix misuses of `path` in error messages
2021-02-25 16:43:30 -08:00
f47c3328ef Merge branch 'js/doc-proto-v2-response-end'
Docfix.

* js/doc-proto-v2-response-end:
  doc: fix naming of response-end-pkt
2021-02-25 16:43:30 -08:00
18decfd11d Merge branch 'rs/blame-optim'
Optimization in "git blame"

* rs/blame-optim:
  blame: remove unnecessary use of get_commit_info()
2021-02-25 16:43:29 -08:00
d590ae5560 Merge branch 'mz/doc-notes-are-not-anchors'
Objects that lost references can be pruned away, even when they
have notes attached to it (and these notes will become dangling,
which in turn can be pruned with "git notes prune").  This has been
clarified in the documentation.

* mz/doc-notes-are-not-anchors:
  docs: clarify that refs/notes/ do not keep the attached objects alive
2021-02-25 16:43:29 -08:00
608cc4f273 Merge branch 'ab/detox-gettext-tests'
Removal of GIT_TEST_GETTEXT_POISON continues.

* ab/detox-gettext-tests:
  tests: remove most uses of test_i18ncmp
  tests: remove last uses of C_LOCALE_OUTPUT
  tests: remove most uses of C_LOCALE_OUTPUT
  tests: remove last uses of GIT_TEST_GETTEXT_POISON=false
2021-02-25 16:43:29 -08:00
6fe12b5215 Merge branch 'jk/rev-list-disk-usage'
"git rev-list" command learned "--disk-usage" option.

* jk/rev-list-disk-usage:
  docs/rev-list: add some examples of --disk-usage
  docs/rev-list: add an examples section
  rev-list: add --disk-usage option for calculating disk usage
  t: add --no-tag option to test_commit
2021-02-25 16:43:29 -08:00
702110aac6 commit-graph: use config to specify generation type
We have two established generation number versions:

 1: topological levels
 2: corrected commit dates

The corrected commit dates are enabled by default, but they also write
extra data in the GDAT and GDOV chunks. Services that host Git data
might want to have more control over when this feature rolls out than
just updating the Git binaries.

Add a new "commitGraph.generationVersion" config option that specifies
the intended generation number version. If this value is less than 2,
then the GDAT chunk is never written _or read_ from an existing file.

This can replace our use of the GIT_TEST_COMMIT_GRAPH_NO_GDAT
environment variable in the test suite. Remove it.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-25 15:10:41 -08:00
c7ef8fe608 commit-graph: create local repository pointer
The write_commit_graph() method uses 'the_repository' in a few places. A
new need for a repository pointer is coming in the following change, so
group these instances into a local variable 'r' that could eventually
become part of the method signature, if so desired.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-25 15:10:40 -08:00
0f1da600e6 remote: write camel-cased *.pushRemote on rename
When a remote is renamed don't change the canonical "*.pushRemote"
form to "*.pushremote". Fixes and tests for a minor bug in
923d4a5ca4 (remote rename/remove: handle branch.<name>.pushRemote
config values, 2020-01-27). See the preceding commit for why this does
& doesn't matter.

While we're at it let's also test that we handle the "*.pushDefault"
key correctly. The code to handle that was added in
b3fd6cbf29 (remote rename/remove: gently handle remote.pushDefault
config, 2020-02-01) and does the right thing, but nothing tested that
we wrote out the canonical camel-cased form.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-24 19:03:00 -08:00
bfa9148ff7 remote: add camel-cased *.tagOpt key, like clone
Change "git remote add" so that it adds a *.tagOpt key, and not the
lower-cased *.tagopt on "git remote add --no-tags", just as "git clone
--no-tags" would do.

This doesn't matter for anything that reads the config. It's just
prettier if we write config keys in their documented camelCase form to
user-readable config files.

When I added support for "clone -no-tags" in 0dab2468ee (clone: add a
--no-tags option to clone without tags, 2017-04-26) I made it use
the *.tagOpt form, but the older "git remote add" added in
111fb85865 (remote add: add a --[no-]tags option, 2010-04-20) has
been using *.tagopt all this time.

It's easy enough to add a test for this, so let's do that. We can't
use "git config -l" there, because it'll normalize the keys to their
lower-cased form. Let's add the test for "git clone" too for good
measure, not just to the "git remote" codepath we're fixing.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-24 19:02:58 -08:00
11875561bf Merge branch 'ds/chunked-file-api' into tb/reverse-midx
* ds/chunked-file-api:
  commit-graph.c: display correct number of chunks when writing
  chunk-format: add technical docs
  chunk-format: restore duplicate chunk checks
  midx: use 64-bit multiplication for chunk sizes
  midx: use chunk-format read API
  commit-graph: use chunk-format read API
  chunk-format: create read chunk API
  midx: use chunk-format API in write_midx_internal()
  midx: drop chunk progress during write
  midx: return success/failure in chunk write methods
  midx: add num_large_offsets to write_midx_context
  midx: add pack_perm to write_midx_context
  midx: add entries to write_midx_context
  midx: use context in write_midx_pack_names()
  midx: rename pack_info to write_midx_context
  commit-graph: use chunk-format write API
  chunk-format: create chunk format write API
  commit-graph: anonymize data in chunk_write_fn
2021-02-24 15:26:14 -08:00
7dd0eaa39c index-format doc: camelCase core.excludesFile
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-24 15:21:25 -08:00
edaf10dd26 blame-options.txt: camelcase blame.blankBoundary
All other references to blame.* configuration variables are
camelCased already.  Update this one to match.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-24 15:21:25 -08:00
77645b5daa i18n.txt: camel case and monospace "i18n.commitEncoding"
In 95791be750 (doc: camelCase the i18n config variables to improve
readability, 2017-07-17), the other i18n config variables were
camel cased. However, this one instance was missed.

Camel case and monospace "i18n.commitEncoding" so that it matches the
surrounding text.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
[jc: fixed 3 other mistakes that are exactly the same]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-24 15:21:25 -08:00
f279894d28 read-cache: make the index write buffer size 128K
Writing an index 8K at a time invokes the OS filesystem and caching code
very frequently, introducing noticeable overhead while writing large
indexes. When experimenting with different write buffer sizes on Windows
writing the Windows OS repo index (260MB), most of the benefit came by
bumping the index write buffer size to 64K. I picked 128K to ensure that
we're past the knee of the curve.

With this change, the time under do_write_index for an index with 3M
files goes from ~1.02s to ~0.72s.

Signed-off-by: Neeraj Singh <neerajsi@ntdev.microsoft.com>
Acked-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-24 13:40:30 -08:00
9ebd7fe158 add: propagate --chmod errors to exit status
If `add` encounters an error while applying the --chmod changes, it
prints a message to stderr, but exits with a success code. This might
have been an oversight, as the command does exit with a non-zero code in
other situations where it cannot (or refuses to) update all of the
requested paths (e.g. when some of the given paths are ignored). So make
the exit behavior more consistent by also propagating --chmod errors to
the exit status.

Note: the test "all statuses changed in folder if . is given" uses paths
added by previous test cases, some of which might be symbolic links.
Because `git add --chmod` will now fail with such paths, this test would
depend on whether all the previous tests were executed, or only some
of them. Avoid that by running the test on a fresh repo with only
regular files.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-24 12:14:51 -08:00
48960894f5 add: mark --chmod error string for translation
This error message is intended for humans, so mark it for translation.
Also use error() instead of fprintf(stderr, ...), to make the
corresponding line a bit cleaner, and to display the "error:" prefix,
which helps classifying the nature/severity of the message.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-24 12:14:51 -08:00
c937d70bfb add --chmod: don't update index when --dry-run is used
`git add --chmod` applies the mode changes even when `--dry-run` is
used. Fix that and add some tests for this option combination.

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-24 12:14:51 -08:00
6347d649bc dir: fix malloc of root untracked_cache_dir
Use FLEX_ALLOC_STR() to allocate the `struct untracked_cache_dir`
for the root directory.  Get rid of unsafe code that might fail to
initialize the `name` field (if FLEX_ARRAY is not 1).  This will
make it clear that we intend to have a structure with an empty
string following it.

A problem was observed on Windows where the length of the memset() was
too short, so the first byte of the name field was not zeroed.  This
resulted in the name field having garbage from a previous use of that
area of memory.

The record for the root directory was then written to the untracked-cache
extension in the index.  This garbage would then be visible to future
commands when they reloaded the untracked-cache extension.

Since the directory record for the root directory had garbage in the
`name` field, the `t/helper/test-tool dump-untracked-cache` tool
printed this garbage as the path prefix (rather than '/') for each
directory in the untracked cache as it recursed.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-24 12:09:10 -08:00
2803d800d2 rebase: add a config option for --no-fork-point
Some users (myself included) would prefer to have this feature off by
default because it can silently drop commits.

Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-24 11:49:10 -08:00
c4ff24bbb3 commit-graph.c: display correct number of chunks when writing
When writing a commit-graph, a progress meter is shown which indicates
the number of pieces of data to write (one per commit in each chunk).

In 47410aa837 (commit-graph: use chunk-format write API, 2021-02-18),
the number of chunks became tracked by the new chunk-format API. But a
stray local variable was left behind from when write_commit_graph_file()
used to keep track of the same.

Since this was no longer updated after 47410aa837, the progress meter
appeared broken:

    $ git commit-graph write --reachable
    Expanding reachable commits in commit graph: 837569, done.
    Writing out commit graph in 3 passes: 166% (4187845/2512707), done.

Drop the local variable and rely instead on the chunk-format API to tell
us the correct number of chunks.

Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-24 11:44:34 -08:00
a9926ecd54 l10n: Update Catalan translation
Signed-off-by: Jordi Mas <jmas@softcatala.org>
2021-02-24 09:14:56 +01:00
20e416409f push: do not turn --delete '' into a matching push
When we added a syntax sugar "git push remote --delete <ref>" to
"git push" as a synonym to the canonical "git push remote :<ref>"
syntax at f517f1f2 (builtin-push: add --delete as syntactic sugar
for :foo, 2009-12-30), we weren't careful enough to make sure that
<ref> is not empty.

Blindly rewriting "--delete <ref>" to ":<ref>" means that an empty
string <ref> results in refspec ":", which is the syntax to ask for
"matching" push that does not delete anything.

Worse yet, if there were matching refs that can be fast-forwarded,
they would have been published prematurely, even if the user feels
that they are not ready yet to be pushed out, which would be a real
disaster.

Noticed-by: Tilman Vogel <tilman.vogel@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-23 15:19:34 -08:00
01168a9d89 doc: mention approxidates for git-commit --date
We describe the more strict date formats accepted by GIT_COMMITTER_DATE,
etc, but the --date option also allows the looser approxidate formats,
as well. Unfortunately we don't have a good or complete reference for
this format, but let's at least mention that it _is_ looser, and give a
few examples.

If we ever write separate, more complete date-format documentation, we
should refer to it from here.

Based-on-a-patch-by: Utku Gultopu <ugultopu@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-23 13:33:02 -08:00
b865734760 replace "parameters" by "arguments" in error messages
When an error message informs the user about an incorrect command
invocation, it should refer to "arguments", not "parameters".

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-23 13:30:45 -08:00
30bb8088af mergetools/vimdiff: add vimdiff1 merge tool variant
This adds yet another vimdiff/gvimdiff variant and presents conflicts as
a two-way diff between 'LOCAL' and 'REMOTE'. 'MERGED' is not opened
which deviates from the norm so usage text is echoed as a Vim message on
startup that instructs the user with how to proceed and how to abort.

Vimdiff is well-suited to two-way diffs so this is an option for a more
simple, more streamlined conflict resolution. For example: it is
difficult to communicate differences across more than two files using
only syntax highlighting; default vimdiff commands to get and put
changes between buffers do not need the user to manually specify
a source or destination buffer when only using two buffers.

Like other merge tools that directly compare 'LOCAL' with 'REMOTE', this
tool will benefit when paired with the new `mergetool.hideResolved`
setting.

Signed-off-by: Seth House <seth@eseth.com>
Tested-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-23 11:37:13 -08:00
00f68732e5 doc/reftable: document how to handle windows
On Windows we can't delete or overwrite files opened by other processes. Here we
sketch how to handle this situation.

We propose to use a random element in the filename. It's possible to design an
alternate solution based on counters, but that would assign semantics to the
filenames that complicates implementation.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-23 10:01:21 -08:00
029bac01a8 Makefile: add {program,xdiff,test,git,fuzz}-objs & objects targets
Add targets to compile the various *.o files we declared in commonly
used *_OBJS variables. This is useful for debugging purposes, to
e.g. get to the point where we can compile a git.o. See [1] for a
use-case for this target.

https://lore.kernel.org/git/YBCGtd9if0qtuQxx@coredump.intra.peff.net/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-23 09:57:59 -08:00
abc3c87f3d Makefile: split OBJECTS into OBJECTS and GIT_OBJS
Add a new GIT_OBJS variable, with the objects sufficient to get to a
git.o or common-main.o.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-23 09:57:58 -08:00
d6da8b328e Makefile: sort OBJECTS assignment for subsequent change
Change the order of the OBJECTS assignment, this makes a follow-up
change where we split it up into two variables smaller.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-23 09:57:58 -08:00
752b3ef972 Makefile: split up long OBJECTS line
Split up the long OBJECTS line into multiple lines using the "+="
assignment we commonly use elsewhere in the Makefile when these lines
get unwieldy.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-23 09:57:58 -08:00
bed3419925 Makefile: guard against TEST_OBJS in the environment
Add TEST_OBJS to the list of other *_OBJS variables we reset. We had
already established this pattern when TEST_OBJS was introduced in
daa99a9172 (Makefile: make sure test helpers are rebuilt when headers
change, 2010-01-26), but it wasn't added to the list in that commit
along with the rest.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-23 09:57:58 -08:00
26c7974376 maintenance: fix incorrect maintenance.repo path with bare repository
The periodic maintenance tasks configured by `git maintenance start`
invoke `git for-each-repo` to run `git maintenance run` on each path
specified by the multi-value global configuration variable
`maintenance.repo`. Because `git for-each-repo` will likely be run
outside of the repositories which require periodic maintenance, it is
mandatory that the repository paths specified by `maintenance.repo` are
absolute.

Unfortunately, however, `git maintenance register` does nothing to
ensure that the paths it assigns to `maintenance.repo` are indeed
absolute, and may in fact -- especially in the case of a bare repository
-- assign a relative path to `maintenance.repo` instead. Fix this
problem by converting all paths to absolute before assigning them to
`maintenance.repo`.

While at it, also fix `git maintenance unregister` to convert paths to
absolute, as well, in order to ensure that it can correctly remove from
`maintenance.repo` a path assigned via `git maintenance register`.

Reported-by: Clement Moyroud <clement.moyroud@gmail.com>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-23 00:22:45 -08:00
0fabafd0b9 builtin/repack.c: add '--geometric' option
Often it is useful to both:

  - have relatively few packfiles in a repository, and

  - avoid having so few packfiles in a repository that we repack its
    entire contents regularly

This patch implements a '--geometric=<n>' option in 'git repack'. This
allows the caller to specify that they would like each pack to be at
least a factor times as large as the previous largest pack (by object
count).

Concretely, say that a repository has 'n' packfiles, labeled P1, P2,
..., up to Pn. Each packfile has an object count equal to 'objects(Pn)'.
With a geometric factor of 'r', it should be that:

  objects(Pi) > r*objects(P(i-1))

for all i in [1, n], where the packs are sorted by

  objects(P1) <= objects(P2) <= ... <= objects(Pn).

Since finding a true optimal repacking is NP-hard, we approximate it
along two directions:

  1. We assume that there is a cutoff of packs _before starting the
     repack_ where everything to the right of that cut-off already forms
     a geometric progression (or no cutoff exists and everything must be
     repacked).

  2. We assume that everything smaller than the cutoff count must be
     repacked. This forms our base assumption, but it can also cause
     even the "heavy" packs to get repacked, for e.g., if we have 6
     packs containing the following number of objects:

       1, 1, 1, 2, 4, 32

     then we would place the cutoff between '1, 1' and '1, 2, 4, 32',
     rolling up the first two packs into a pack with 2 objects. That
     breaks our progression and leaves us:

       2, 1, 2, 4, 32
         ^

     (where the '^' indicates the position of our split). To restore a
     progression, we move the split forward (towards larger packs)
     joining each pack into our new pack until a geometric progression
     is restored. Here, that looks like:

       2, 1, 2, 4, 32  ~>  3, 2, 4, 32  ~>  5, 4, 32  ~> ... ~> 9, 32
         ^                   ^                ^                   ^

This has the advantage of not repacking the heavy-side of packs too
often while also only creating one new pack at a time. Another wrinkle
is that we assume that loose, indexed, and reflog'd objects are
insignificant, and lump them into any new pack that we create. This can
lead to non-idempotent results.

Suggested-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-22 23:30:52 -08:00
20b031fede packfile: add kept-pack cache for find_kept_pack_entry()
In a recent patch we added a function 'find_kept_pack_entry()' to look
for an object only among kept packs.

While this function avoids doing any lookup work in non-kept packs, it
is still linear in the number of packs, since we have to traverse the
linked list of packs once per object. Let's cache a reduced version of
that list to save us time.

Note that this cache will last the lifetime of the program. We could
invalidate it on reprepare_packed_git(), but there's not much point in
being rigorous here:

  - we might already fail to notice new .keep packs showing up after the
    program starts. We only reprepare_packed_git() when we fail to find
    an object. But adding a new pack won't cause that to happen.
    Somebody repacking could add a new pack and delete an old one, but
    most of the time we'd have a descriptor or mmap open to the old
    pack anyway, so we might not even notice.

  - in pack-objects we already cache the .keep state at startup, since
    56dfeb6263 (pack-objects: compute local/ignore_pack_keep early,
    2016-07-29). So this is just extending that concept further.

  - we don't have to worry about any packed_git being removed; we always
    keep the old structs around, even after reprepare_packed_git()

We do defensively invalidate the cache in case the set of kept packs
being asked for changes (e.g., only in-core kept packs were cached, but
suddenly the caller also wants on-disk kept packs, too). In theory we
could build all three caches and switch between them, but it's not
necessary, since this patch (and series) never changes the set of kept
packs that it wants to inspect from the cache.

So that "optimization" is more about being defensive in the face of
future changes than it is about asking for multiple kinds of kept packs
in this patch.

Here are p5303 results (as always, measured against the kernel):

  Test                                        HEAD^                   HEAD
  -----------------------------------------------------------------------------------------------
  5303.5: repack (1)                          57.34(54.66+10.88)      56.98(54.36+10.98) -0.6%
  5303.6: repack with kept (1)                57.38(54.83+10.49)      57.17(54.97+10.26) -0.4%
  5303.11: repack (50)                        71.70(88.99+4.74)       71.62(88.48+5.08) -0.1%
  5303.12: repack with kept (50)              72.58(89.61+4.78)       71.56(88.80+4.59) -1.4%
  5303.17: repack (1000)                      217.19(491.72+14.25)    217.31(490.82+14.53) +0.1%
  5303.18: repack with kept (1000)            246.12(520.07+14.93)    217.08(490.37+15.10) -11.8%

and the --stdin-packs case, which scales a little bit better (although
not by that much even at 1,000 packs):

  5303.7: repack with --stdin-packs (1)       0.00(0.00+0.00)         0.00(0.00+0.00) =
  5303.13: repack with --stdin-packs (50)     3.43(11.75+0.24)        3.43(11.69+0.30) +0.0%
  5303.19: repack with --stdin-packs (1000)   130.50(307.15+7.66)     125.13(301.36+8.04) -4.1%

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-22 23:30:52 -08:00
6325da14af builtin/pack-objects.c: rewrite honor-pack-keep logic
Now that we have find_kept_pack_entry(), we don't have to manually keep
hunting through every pack to find a possible "kept" duplicate of the
object. This should be faster, assuming only a portion of your total
packs are actually kept.

Note that we have to re-order the logic a bit here; we can deal with the
disqualifying situations first (e.g., finding the object in a non-local
pack with --local), then "kept" situation(s), and then just fall back to
other "--local" conditions.

Here are the results from p5303 (measurements again taken on the
kernel):

  Test                                        HEAD^                   HEAD
  -----------------------------------------------------------------------------------------------
  5303.5: repack (1)                          57.26(54.59+10.84)      57.34(54.66+10.88) +0.1%
  5303.6: repack with kept (1)                57.33(54.80+10.51)      57.38(54.83+10.49) +0.1%
  5303.11: repack (50)                        71.54(88.57+4.84)       71.70(88.99+4.74) +0.2%
  5303.12: repack with kept (50)              85.12(102.05+4.94)      72.58(89.61+4.78) -14.7%
  5303.17: repack (1000)                      216.87(490.79+14.57)    217.19(491.72+14.25) +0.1%
  5303.18: repack with kept (1000)            665.63(938.87+15.76)    246.12(520.07+14.93) -63.0%

and the --stdin-packs timings:

  5303.7: repack with --stdin-packs (1)       0.01(0.01+0.00)         0.00(0.00+0.00) -100.0%
  5303.13: repack with --stdin-packs (50)     3.53(12.07+0.24)        3.43(11.75+0.24) -2.8%
  5303.19: repack with --stdin-packs (1000)   195.83(371.82+8.10)     130.50(307.15+7.66) -33.4%

So our repack with an empty .keep pack is roughly as fast as one without
a .keep pack up to 50 packs. But the --stdin-packs case scales a little
better, too.

Notably, it is faster than a repack of the same size and a kept pack. It
looks at fewer objects, of course, but the penalty for looking at many
packs isn't as costly.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-22 23:30:52 -08:00
fbf20aeeef p5303: measure time to repack with keep
Add two new tests to measure repack performance. Both tests split the
repository into synthetic "pushes", and then leave the remaining objects
in a big base pack.

The first new test marks an empty pack as "kept" and then passes
--honor-pack-keep to avoid including objects in it. That doesn't change
the resulting pack, but it does let us compare to the normal repack case
to see how much overhead we add to check whether objects are kept or
not.

The other test is of --stdin-packs, which gives us a sense of how that
number scales based on the number of packs we provide as input. In each
of those tests, the empty pack isn't considered, but the residual pack
(objects that were left over and not included in one of the synthetic
push packs) is marked as kept.

(Note that in the single-pack case of the --stdin-packs test, there is
nothing do since there are no non-excluded packs).

Here are some timings on a recent clone of the kernel:

  5303.5: repack (1)                          57.26(54.59+10.84)
  5303.6: repack with kept (1)                57.33(54.80+10.51)

in the 50-pack case, things start to slow down:

  5303.11: repack (50)                        71.54(88.57+4.84)
  5303.12: repack with kept (50)              85.12(102.05+4.94)

and by the time we hit 1,000 packs, things are substantially worse, even
though the resulting pack produced is the same:

  5303.17: repack (1000)                      216.87(490.79+14.57)
  5303.18: repack with kept (1000)            665.63(938.87+15.76)

That's because the code paths around handling .keep files are known to
scale badly; they look in every single pack file to find each object.
Our solution to that was to notice that most repos don't have keep
files, and to make that case a fast path. But as soon as you add a
single .keep, that part of pack-objects slows down again (even if we
have fewer objects total to look at).

Likewise, the scaling is pretty extreme on --stdin-packs (but each
subsequent test is also being asked to do more work):

  5303.7: repack with --stdin-packs (1)       0.01(0.01+0.00)
  5303.13: repack with --stdin-packs (50)     3.53(12.07+0.24)
  5303.19: repack with --stdin-packs (1000)   195.83(371.82+8.10)

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-22 23:30:52 -08:00
60bb5f2f5d p5303: add missing &&-chains
These are in a helper function, so the usual chain-lint doesn't notice
them. This function is still not perfect, as it has some git invocations
on the left-hand-side of the pipe, but it's primary purpose is timing,
not finding bugs or correctness issues.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-22 23:30:52 -08:00
339bce27f4 builtin/pack-objects.c: add '--stdin-packs' option
In an upcoming commit, 'git repack' will want to create a pack comprised
of all of the objects in some packs (the included packs) excluding any
objects in some other packs (the excluded packs).

This caller could iterate those packs themselves and feed the objects it
finds to 'git pack-objects' directly over stdin, but this approach has a
few downsides:

  - It requires every caller that wants to drive 'git pack-objects' in
    this way to implement pack iteration themselves. This forces the
    caller to think about details like what order objects are fed to
    pack-objects, which callers would likely rather not do.

  - If the set of objects in included packs is large, it requires
    sending a lot of data over a pipe, which is inefficient.

  - The caller is forced to keep track of the excluded objects, too, and
    make sure that it doesn't send any objects that appear in both
    included and excluded packs.

But the biggest downside is the lack of a reachability traversal.
Because the caller passes in a list of objects directly, those objects
don't get a namehash assigned to them, which can have a negative impact
on the delta selection process, causing 'git pack-objects' to fail to
find good deltas even when they exist.

The caller could formulate a reachability traversal themselves, but the
only way to drive 'git pack-objects' in this way is to do a full
traversal, and then remove objects in the excluded packs after the
traversal is complete. This can be detrimental to callers who care
about performance, especially in repositories with many objects.

Introduce 'git pack-objects --stdin-packs' which remedies these four
concerns.

'git pack-objects --stdin-packs' expects a list of pack names on stdin,
where 'pack-xyz.pack' denotes that pack as included, and
'^pack-xyz.pack' denotes it as excluded. The resulting pack includes all
objects that are present in at least one included pack, and aren't
present in any excluded pack.

To address the delta selection problem, 'git pack-objects --stdin-packs'
works as follows. First, it assembles a list of objects that it is going
to pack, as above. Then, a reachability traversal is started, whose tips
are any commits mentioned in included packs. Upon visiting an object, we
find its corresponding object_entry in the to_pack list, and set its
namehash parameter appropriately.

To avoid the traversal visiting more objects than it needs to, the
traversal is halted upon encountering an object which can be found in an
excluded pack (by marking the excluded packs as kept in-core, and
passing --no-kept-objects=in-core to the revision machinery).

This can cause the traversal to halt early, for example if an object in
an included pack is an ancestor of ones in excluded packs. But stopping
early is OK, since filling in the namehash fields of objects in the
to_pack list is only additive (i.e., having it helps the delta selection
process, but leaving it blank doesn't impact the correctness of the
resulting pack).

Even still, it is unlikely that this hurts us much in practice, since
the 'git repack --geometric' caller (which is introduced in a later
commit) marks small packs as included, and large ones as excluded.
During ordinary use, the small packs usually represent pushes after a
large repack, and so are unlikely to be ancestors of objects that
already exist in the repository.

(I found it convenient while developing this patch to have 'git
pack-objects' report the number of objects which were visited and got
their namehash fields filled in during traversal. This is also included
in the below patch via trace2 data lines).

Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-22 23:30:52 -08:00
c9fff00016 revision: learn '--no-kept-objects'
A future caller will want to be able to perform a reachability traversal
which terminates when visiting an object found in a kept pack. The
closest existing option is '--honor-pack-keep', but this isn't quite
what we want. Instead of halting the traversal midway through, a full
traversal is always performed, and the results are only trimmed
afterwords.

Besides needing to introduce a new flag (since culling results
post-facto can be different than halting the traversal as it's
happening), there is an additional wrinkle handling the distinction
in-core and on-disk kept packs. That is: what kinds of kept pack should
stop the traversal?

Introduce '--no-kept-objects[=<on-disk|in-core>]' to specify which kinds
of kept packs, if any, should stop a traversal. This can be useful for
callers that want to perform a reachability analysis, but want to leave
certain packs alone (for e.g., when doing a geometric repack that has
some "large" packs which are kept in-core that it wants to leave alone).

Note that this option is not guaranteed to produce exactly the set of
objects that aren't in kept packs, since it's possible the traversal
order may end up in a situation where a non-kept ancestor was "cut off"
by a kept object (at which point we would stop traversing). But, we
don't care about absolute correctness here, since this will eventually
be used as a purely additive guide in an upcoming new repack mode.

Explicitly avoid documenting this new flag, since it is only used
internally. In theory we could avoid even adding it rev-list, but being
able to spell this option out on the command-line makes some special
cases easier to test without promising to keep it behaving consistently
forever. Those tricky cases are exercised in t6114.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-22 23:30:52 -08:00
f62312e028 packfile: introduce 'find_kept_pack_entry()'
Future callers will want a function to fill a 'struct pack_entry' for a
given object id but _only_ from its position in any kept pack(s).

In particular, an new 'git repack' mode which ensures the resulting
packs form a geometric progress by object count will mark packs that it
does not want to repack as "kept in-core", and it will want to halt a
reachability traversal as soon as it visits an object in any of the kept
packs. But, it does not want to halt the traversal at non-kept, or
.keep packs.

The obvious alternative is 'find_pack_entry()', but this doesn't quite
suffice since it only returns the first pack it finds, which may or may
not be kept (and the mru cache makes it unpredictable which one you'll
get if there are options).

Short of that, you could walk over all packs looking for the object in
each one, but it scales with the number of packs, which may be
prohibitive.

Introduce 'find_kept_pack_entry()', a function which is like
'find_pack_entry()', but only fills in objects in the kept packs.

Handle packs which have .keep files, as well as in-core kept packs
separately, since certain callers will want to distinguish one from the
other. (Though on-disk and in-core kept packs share the adjective
"kept", it is best to think of the two sets as independent.)

There is a gotcha when looking up objects that are duplicated in kept
and non-kept packs, particularly when the MIDX stores the non-kept
version and the caller asked for kept objects only. This could be
resolved by teaching the MIDX to resolve duplicates by always favoring
the kept pack (if one exists), but this breaks an assumption in existing
MIDXs, and so it would require a format change.

The benefit to changing the MIDX in this way is marginal, so we instead
have a more thorough check here which is explained with a comment.

Callers will be added in subsequent patches.

Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-22 23:30:52 -08:00
966e671106 The tenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-22 16:12:43 -08:00
d68fccef86 Merge branch 'ab/test-lib'
Test framework clean-up.

* ab/test-lib:
  test-lib-functions: assert correct parameter count
  test-lib-functions: remove bug-inducing "diagnostics" helper param
  test libs: rename "diff-lib" to "lib-diff"
  t/.gitattributes: sort lines
  test-lib-functions: move function to lib-bitmap.sh
  test libs: rename gitweb-lib.sh to lib-gitweb.sh
  test libs: rename bundle helper to "lib-bundle.sh"
  test-lib-functions: remove generate_zero_bytes() wrapper
  test-lib-functions: move test_set_index_version() to its user
  test lib: change "error" to "BUG" as appropriate
  test-lib: remove check_var_migration
2021-02-22 16:12:43 -08:00
45df6c4d75 Merge branch 'ab/diff-deferred-free'
A small memleak in "diff -I<regexp>" has been corrected.

* ab/diff-deferred-free:
  diff: plug memory leak from regcomp() on {log,diff} -I
  diff: add an API for deferred freeing
2021-02-22 16:12:43 -08:00
dcb11fc622 Merge branch 'ab/pager-exit-log'
When a pager spawned by us exited, the trace log did not record its
exit status correctly, which has been corrected.

* ab/pager-exit-log:
  pager: properly log pager exit code when signalled
  run-command: add braces for "if" block in wait_or_whine()
  pager: test for exit code with and without SIGPIPE
  pager: refactor wait_for_pager() function
2021-02-22 16:12:43 -08:00
dc24948be9 Merge branch 'ta/hash-function-transition-doc'
Update formatting and grammar of the hash transition plan
documentation, plus some updates.

* ta/hash-function-transition-doc:
  doc: use https links
  doc hash-function-transition: move rationale upwards
  doc hash-function-transition: fix incomplete sentence
  doc hash-function-transition: use upper case consistently
  doc hash-function-transition: use SHA-1 and SHA-256 consistently
  doc hash-function-transition: fix asciidoc output
2021-02-22 16:12:43 -08:00
15af6e6fee Merge branch 'bc/signed-objects-with-both-hashes'
Signed commits and tags now allow verification of objects, whose
two object names (one in SHA-1, the other in SHA-256) are both
signed.

* bc/signed-objects-with-both-hashes:
  gpg-interface: remove other signature headers before verifying
  ref-filter: hoist signature parsing
  commit: allow parsing arbitrary buffers with headers
  gpg-interface: improve interface for parsing tags
  commit: ignore additional signatures when parsing signed commits
  ref-filter: switch some uses of unsigned long to size_t
2021-02-22 16:12:42 -08:00
b9554c03a0 Merge branch 'dl/stash-cleanup'
Documentation, code and test clean-up around "git stash".

* dl/stash-cleanup:
  stash: declare ref_stash as an array
  t3905: use test_cmp() to check file contents
  t3905: replace test -s with test_file_not_empty
  t3905: remove nested git in command substitution
  t3905: move all commands into test cases
  t3905: remove spaces after redirect operators
  git-stash.txt: be explicit about subcommand options
2021-02-22 16:12:42 -08:00
bf4bb9f9f5 commit-graph: avoid leaking topo_levels slab in write_commit_graph()
write_commit_graph initialises topo_levels using init_topo_level_slab(),
next it calls compute_topological_levels() which can cause the slab to
grow, we therefore need to clear the slab again using
clear_topo_level_slab() when we're done.

First introduced in 72a2bfca (commit-graph: add a slab to store
topological levels, 2021-01-16).

LeakSanitizer output:

==1026==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 8 byte(s) in 1 object(s) allocated from:
    #0 0x498ae9 in realloc /src/llvm-project/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
    #1 0xafbed8 in xrealloc /src/git/wrapper.c:126:8
    #2 0x7966d1 in topo_level_slab_at_peek /src/git/commit-graph.c:71:1
    #3 0x7965e0 in topo_level_slab_at /src/git/commit-graph.c:71:1
    #4 0x78fbf5 in compute_topological_levels /src/git/commit-graph.c:1472:12
    #5 0x78c5c3 in write_commit_graph /src/git/commit-graph.c:2456:2
    #6 0x535c5f in graph_write /src/git/builtin/commit-graph.c:299:6
    #7 0x5350ca in cmd_commit_graph /src/git/builtin/commit-graph.c:337:11
    #8 0x4cddb1 in run_builtin /src/git/git.c:453:11
    #9 0x4cabe2 in handle_builtin /src/git/git.c:704:3
    #10 0x4cd084 in run_argv /src/git/git.c:771:4
    #11 0x4ca424 in cmd_main /src/git/git.c:902:19
    #12 0x707fb6 in main /src/git/common-main.c:52:11
    #13 0x7fee4249383f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2083f)

Indirect leak of 524256 byte(s) in 1 object(s) allocated from:
    #0 0x498942 in calloc /src/llvm-project/compiler-rt/lib/asan/asan_malloc_linux.cpp:154:3
    #1 0xafc088 in xcalloc /src/git/wrapper.c:140:8
    #2 0x796870 in topo_level_slab_at_peek /src/git/commit-graph.c:71:1
    #3 0x7965e0 in topo_level_slab_at /src/git/commit-graph.c:71:1
    #4 0x78fbf5 in compute_topological_levels /src/git/commit-graph.c:1472:12
    #5 0x78c5c3 in write_commit_graph /src/git/commit-graph.c:2456:2
    #6 0x535c5f in graph_write /src/git/builtin/commit-graph.c:299:6
    #7 0x5350ca in cmd_commit_graph /src/git/builtin/commit-graph.c:337:11
    #8 0x4cddb1 in run_builtin /src/git/git.c:453:11
    #9 0x4cabe2 in handle_builtin /src/git/git.c:704:3
    #10 0x4cd084 in run_argv /src/git/git.c:771:4
    #11 0x4ca424 in cmd_main /src/git/git.c:902:19
    #12 0x707fb6 in main /src/git/common-main.c:52:11
    #13 0x7fee4249383f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2083f)

SUMMARY: AddressSanitizer: 524264 byte(s) leaked in 2 allocation(s).

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-22 13:45:01 -08:00
1c881026a1 difftool.c: learn a new way start at specified file
`git difftool` only allow us to select file to view in turn.
If there is a commit with many files and we exit in the middle,
we will have to traverse list again to get the file diff which
we want to see. Therefore,teach the command an option
`--skip-to=<path>` to allow the user to say that diffs for earlier
paths are not interesting (because they were already seen in an
earlier session) and start this session with the named path.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-22 13:35:49 -08:00
41f3c9949f commit-reach: stale commits may prune generation further
The remove_redundant_with_gen() algorithm performs a depth-first-search
to find commits in the 'array' list, starting at the parents of each
commit in 'array'. The result is that commits in 'array' are marked
STALE when they are reachable from another commit in 'array'.

This depth-first-search is fast when commits lie on or near the
first-parent history of the higher commits. The search terminates early
if all but one commit becomes marked STALE.

However, it is possible that there are two independent commits with high
generation number. In that case, the depth-first-search might languish
by searching in lower generations due to the fixed min_generation used
throughout the method.

With the expectation that commits with lower generation are expected to
become STALE more often, we can optimize further by increasing that
min_generation boundary upon discovery of the commit with minimum
generation.

We must first sort the commits in 'array' by generation. We cannot sort
'array' itself since it must preserve relative order among the returned
results (see revision.c:mark_redundant_parents() for an example).

This simplifies the initialization of min_generation, but it also allows
us to increase the new min_generation when we find the commit with
smallest generation remaining.

This requires more than two commits in order to test, so I used the
Linux kernel repository with a few commits that are slightly off of the
first-parent history. I timed the following command:

  git merge-base --independent 2ecedd756908 d2360a398f0b \
	1253935ad801 160bab43419e 0e2209629fec 1d0e16ac1a9e

The first two commits have similar generation and are near the v5.10
tag. Commit 160bab43419e is off of the first-parent history behind v5.5,
while the others are scattered somewhere reachable from v5.9. This is
designed to demonstrate the optimization, as that commit within v5.5
would normally cause a lot of extra commit walking.

Since remove_redundant_with_alg() is called only when at least one of
the input commits has a finite generation number, this algorithm is
tested with a commit-graph generated starting at a number of different
tags, the earliest being v5.5.

commit-graph at v5.5:

 | Method                | Time  |
 |-----------------------+-------|
 | *_no_gen()            | 864ms |
 | *_with_gen() (before) | 858ms |
 | *_with_gen() (after)  | 810ms |

commit-graph at v5.7:

 | Method                | Time  |
 |-----------------------+-------|
 | *_no_gen()            | 625ms |
 | *_with_gen() (before) | 572ms |
 | *_with_gen() (after)  | 517ms |

commit-graph at v5.9:

 | Method                | Time  |
 |-----------------------+-------|
 | *_no_gen()            | 268ms |
 | *_with_gen() (before) | 224ms |
 | *_with_gen() (after)  | 202ms |

commit-graph at v5.10:

 | Method                | Time  |
 |-----------------------+-------|
 | *_no_gen()            |  72ms |
 | *_with_gen() (before) |  37ms |
 | *_with_gen() (after)  |   9ms |

Note that these are only modest improvements for the case where the two
independent commits are not in the commit-graph (not until v5.10). All
algorithms get faster as more commits are indexed, which is not a
surprise. However, the cost of walking extra commits is more and more
prevalent in relative terms as more commits are indexed. Finally, the
last case allows us to jump to the minimum generation between the last
two commits (that are actually independent) so we greatly reduce the
cost in that case.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-22 13:34:34 -08:00
3677773371 commit-reach: use heuristic in remove_redundant()
Reachability algorithms in commit-reach.c frequently benefit from using
the first-parent history as a heuristic for satisfying reachability
queries. The most obvious example was implemented in 4fbcca4e
(commit-reach: make can_all_from_reach... linear, 2018-07-20).

Update the walk in remove_redundant() to use this same heuristic. Here,
we are walking starting at the parents of the input commits. Sort those
parents and walk from the highest generation to lower. Each time, use
the heuristic of searching the first parent history before continuing to
expand the walk.

The order in which we explore the commits matters, so update
compare_commits_by_gen to break generation number ties with commit date.
This has no effect when the commits are in a commit-graph file with
corrected commit dates computed, but it will assist when the commits are
in the region "above" the commit-graph with "infinite" generation
number. Note that we cannot shift to use
compare_commits_by_gen_then_commit_date as the method prototype is
different. We use compare_commits_by_gen for QSORT() as opposed to as a
priority function.

The important piece is to ensure we short-circuit the walk when we find
that there is a single non-redundant commit. This happens frequently
when looking for merge-bases or comparing several tags with 'git
merge-base --independent'. Use a new count 'count_still_independent' and
if that hits 1 we can stop walking.

To update 'count_still_independent' properly, we add use of the RESULT
flag on the input commits. Then we can detect when we reach one of these
commits and decrease the count. We need to remove the RESULT flag at
that moment because we might re-visit that commit when popping the
stack.

We use the STALE flag to mark parents that have been added to the new
walk_start list, but we need to clear that flag before we start walking
so those flags don't halt our depth-first-search walk.

On my copy of the Linux kernel repository, the performance of 'git
merge-base --independent <all-tags>' goes from 1.1 seconds to 0.11
seconds.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-22 13:34:34 -08:00
c8d693e1e6 commit-reach: move compare_commits_by_gen
Move this earlier in the file so it can be used by more methods.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-22 13:34:34 -08:00
fbc21e3fbb commit-reach: use one walk in remove_redundant()
The current implementation of remove_redundant() uses several calls to
paint_down_to_common() to determine that commits are independent of each
other. This leads to quadratic behavior when many inputs are passed to
commands such as 'git merge-base'.

For example, in the Linux kernel repository, I tested the performance
by passing all tags:

 git merge-base --independent $(git for-each-ref refs/tags --format="$(refname)")

(Note: I had to delete the tags v2.6.11-tree and v2.6.11 as they do
not point to commits.)

Here is the performance improvement introduced by this change:

 Before: 16.4s
  After:  1.1s

This performance improvement requires the commit-graph file to be
present. We keep the old algorithm around as remove_redundant_no_gen()
and use it when generation_numbers_enabled() is false. This is similar
to other algorithms within commit-reach.c. The new algorithm is
implemented in remove_redundant_with_gen().

The basic approach is to do one commit walk instead of many. First, scan
all commits in the list and mark their _parents_ with the STALE flag.
This flag will indicate commits that are reachable from one of the
inputs, except not including themselves. Then, walk commits until
covering all commits up to the minimum generation number pushing the
STALE flag throughout.

At the end, we need to clear the STALE bit from all of the commits
we walked. We move the non-stale commits in 'array' to the beginning of
the list, and this might overwrite stale commits. However, we store an
array of commits that started the walk, and use clear_commit_marks() on
each of those starting commits. That method will walk the reachable
commits with the STALE bit and clear them all. This makes the algorithm
safe for re-entry or for other uses of those commits after this walk.

This logic is covered by tests in t6600-test-reach.sh, so the behavior
does not change. This is tested both in the case with a commit-graph and
without.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-22 13:34:34 -08:00
3a837b58e3 doc: mention bigFileThreshold for packing
Knowing about the core.bigFileThreshold configuration variable is
helpful when examining pack file size differences between repositories.
Add a reference to it to the manpages a user is likely to read in this
situation.

Capitalize CONFIGURATION for consistency with other pages having such a
section.

Signed-off-by: Christian Walther <cwalther@gmx.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-22 13:18:30 -08:00
5476e1efde fetch-pack: print and use dangling .gitmodules
Teach index-pack to print dangling .gitmodules links after its "keep" or
"pack" line instead of declaring an error, and teach fetch-pack to check
such lines printed.

This allows the tree side of the .gitmodules link to be in one packfile
and the blob side to be in another without failing the fsck check,
because it is now fetch-pack which checks such objects after all
packfiles have been downloaded and indexed (and not index-pack on an
individual packfile, as it is before this commit).

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-22 12:07:40 -08:00
b664e9ffa1 fetch-pack: with packfile URIs, use index-pack arg
Unify the index-pack arguments used when processing the inline pack and
when downloading packfiles referenced by URIs. This is done by teaching
get_pack() to also store the index-pack arguments whenever at least one
packfile URI is given, and then when processing the packfile URI(s),
using the stored arguments.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-22 12:07:40 -08:00
27e35ba6c6 http-fetch: allow custom index-pack args
This is the next step in teaching fetch-pack to pass its index-pack
arguments when processing packfiles referenced by URIs.

The "--keep" in fetch-pack.c will be replaced with a full message in a
subsequent commit.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-22 12:07:40 -08:00
726b25a91b http: allow custom index-pack args
Currently, when fetching, packfiles referenced by URIs are run through
index-pack without any arguments other than --stdin and --keep, no
matter what arguments are used for the packfile that is inline in the
fetch response. As a preparation for ensuring that all packs (whether
inline or not) use the same index-pack arguments, teach the http
subsystem to allow custom index-pack arguments.

http-fetch has been updated to use the new API. For now, it passes
--keep alone instead of --keep with a process ID, but this is only
temporary because http-fetch itself will be taught to accept index-pack
parameters (instead of using a hardcoded constant) in a subsequent
commit.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-22 12:07:40 -08:00
b1056f60b6 Merge branch 'py/commit-comments'
Use git-stripspace to remove comment lines from the commit message. Also
use it to clean up whitespace instead of rolling our own logic.

* py/commit-comments:
  git-gui: remove lines starting with the comment character
2021-02-22 20:19:53 +05:30
1b5b8cf072 Documentation: typofix --column description
f4ed0af6 (Merge branch 'nd/columns', 2012-05-03) brought in three
cut-and-pasted copies of malformatted descriptions.  Let's fix them
all the same way by marking the configuration variable names up as
monospace just like the command line option `--column` is typeset.

While we are at it, correct a missing space after the full stop that
ends the sentence.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-19 19:36:47 -08:00
a43a2e6c2a chunk-format: add technical docs
The chunk-based file format is now an API in the code, but we should
also take time to document it as a file format. Specifically, it matches
the CHUNK LOOKUP sections of the commit-graph and multi-pack-index
files, but there are some commonalities that should be grouped in this
document.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-18 13:38:16 -08:00
5387fefadc chunk-format: restore duplicate chunk checks
Before refactoring into the chunk-format API, the commit-graph parsing
logic included checks for duplicate chunks. It is unlikely that we would
desire a chunk-based file format that allows duplicate chunk IDs in the
table of contents, so add duplicate checks into
read_table_of_contents().

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-18 13:38:16 -08:00
329fac3a36 midx: use 64-bit multiplication for chunk sizes
When calculating the sizes of certain chunks, we should use 64-bit
multiplication always. This allows us to properly predict the chunk
sizes without risk of overflow.

Other possible overflows were discovered by evaluating each
multiplication in midx.c and ensuring that at least one side of the
operator was of type size_t or off_t.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-18 13:38:16 -08:00
6ab3b8b8b8 midx: use chunk-format read API
Instead of parsing the table of contents directly, use the chunk-format
API methods read_table_of_contents() and pair_chunk(). In particular, we
can use the return value of pair_chunk() to generate an error when a
required chunk is missing.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-18 13:38:16 -08:00
2692c2f6fd commit-graph: use chunk-format read API
Instead of parsing the table of contents directly, use the chunk-format
API methods read_table_of_contents() and pair_chunk(). While the current
implementation loses the duplicate-chunk detection, that will be added
in a future change.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-18 13:38:16 -08:00
5f0879f54b chunk-format: create read chunk API
Add the capability to read the table of contents, then pair the chunks
with necessary logic using read_chunk_fn pointers. Callers will be added
in future changes, but the typical outline will be:

 1. initialize a 'struct chunkfile' with init_chunkfile(NULL).
 2. call read_table_of_contents().
 3. for each chunk to parse,
    a. call pair_chunk() to assign a pointer with the chunk position, or
    b. call read_chunk() to run a callback on the chunk start and size.
 4. call free_chunkfile() to clear the 'struct chunkfile' data.

We are re-using the anonymous 'struct chunkfile' data, as it is internal
to the chunk-format API. This gives it essentially two modes: write and
read. If the same struct instance was used for both reads and writes,
then there would be failures.

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-18 13:38:16 -08:00
63a8f0e9b9 midx: use chunk-format API in write_midx_internal()
The chunk-format API allows writing the table of contents and all chunks
using the anonymous 'struct chunkfile' type. We only need to convert our
local chunk logic to this API for the multi-pack-index writes to share
that logic with the commit-graph file writes.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-18 13:38:16 -08:00
c1442410d8 midx: drop chunk progress during write
Most expensive operations in write_midx_internal() use the context
struct's progress member, and these indicate the process of the
expensive operations within the chunk writing methods. However, there is
a competing progress struct that counts the progress over all chunks.
This is not very helpful compared to the others, so drop it.

This also reduces our barriers to combining the chunk writing code with
chunk-format.c.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-18 13:38:16 -08:00
0ccd713cb6 midx: return success/failure in chunk write methods
Historically, the chunk-writing methods in midx.c have returned the
amount of data written so the writer method could compare this with the
table of contents. This presents with some interesting issues:

1. If a chunk writing method has a bug that miscalculates the written
   bytes, then we can satisfy the table of contents without actually
   writing the right amount of data to the hashfile. The commit-graph
   writing code checks the hashfile struct directly for a more robust
   verification.

2. There is no way for a chunk writing method to gracefully fail.
   Returning an int presents an opportunity to fail without a die().

3. The current pattern doesn't match chunk_write_fn type exactly, so we
   cannot share code with commit-graph.c

For these reasons, convert the midx chunk writer methods to return an
'int'. Since none of them fail at the moment, they all return 0.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-18 13:38:16 -08:00
980f525c3c midx: add num_large_offsets to write_midx_context
In an effort to align write_midx_internal() with the chunk-format API,
continue to group necessary data into "struct write_midx_context". This
change collects the "uint32_t num_large_offsets" into the context. With
this new data, write_midx_large_offsets() now matches the
chunk_write_fn type.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-18 13:38:16 -08:00
7a3ada1192 midx: add pack_perm to write_midx_context
In an effort to align write_midx_internal() with the chunk-format API,
continue to group necessary data into "struct write_midx_context". This
change collects the "uint32_t *pack_perm" and large_offsets_needed bit
into the context.

Update write_midx_object_offsets() to match chunk_write_fn.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-18 13:38:16 -08:00
31bda9a237 midx: add entries to write_midx_context
In an effort to align write_midx_internal() with the chunk-format API,
continue to group necessary data into "struct write_midx_context". This
change collects the "struct pack_midx_entry *entries" list and its count
into the context.

Update write_midx_oid_fanout() and write_midx_oid_lookup() to take the
context directly, as these are easy conversions with this new data.

Only the callers of write_midx_object_offsets() and
write_midx_large_offsets() are updated here, since additional data in
the context before those methods can match chunk_write_fn.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-18 13:38:16 -08:00
b4d941420b midx: use context in write_midx_pack_names()
In an effort to align the write_midx_internal() to use the chunk-format
API, start converting chunk writing methods to match chunk_write_fn. The
first case is to convert write_midx_pack_names() to take "void *data".
We already have the necessary data in "struct write_midx_context", so
this conversion is rather mechanical.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-18 13:38:16 -08:00
577dc49696 midx: rename pack_info to write_midx_context
In an effort to streamline our chunk-based file formats, align some of
the code structure in write_midx_internal() to be similar to the
patterns in write_commit_graph_file().

Specifically, let's create a "struct write_midx_context" that can be
used as a data parameter to abstract function types.

This change only renames "struct pack_info" to "struct
write_midx_context" and the names of instances from "packs" to "ctx". In
future changes, we will expand the data inside "struct
write_midx_context" and align our chunk-writing method with the
chunk-format API.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-18 13:38:16 -08:00
47410aa837 commit-graph: use chunk-format write API
The commit-graph write logic is ready to make use of the chunk-format
write API. Each chunk write method is already in the correct prototype.
We only need to use the 'struct chunkfile' pointer and the correct API
calls.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-18 13:38:16 -08:00
570df42610 chunk-format: create chunk format write API
In anticipation of combining the logic from the commit-graph and
multi-pack-index file formats, create a new chunk-format API. Use a
'struct chunkfile' pointer to keep track of data that has been
registered for writes. This struct is anonymous outside of
chunk-format.c to ensure no user attempts to interfere with the data.

The next change will use this API in commit-graph.c, but the general
approach is:

 1. initialize the chunkfile with init_chunkfile(f).
 2. add chunks in the intended writing order with add_chunk().
 3. write any header information to the hashfile f.
 4. write the chunkfile data using write_chunkfile().
 5. free the chunkfile struct using free_chunkfile().

Helped-by: Taylor Blau <me@ttaylorr.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-18 13:38:16 -08:00
f89f46b704 gitmailmap.txt: fix rendering of e-mail addresses
Both AsciiDoc and Asciidoctor are eager to pick up the e-mail addresses
in this document and turn them into references at the bottom of the
manpage / clickable links. We don't really need that for these dummy
addresses. Spell "@" as "&#64;" to make them not do this. In the open
block, we can instead avoid this by indenting the contents, similar to
the earlier blocks.

Fix a backtick which should have been a single quote mark. With all the
quoting that is going on around here, this mistake trips up the parsing
and rendering quite a bit.

Before this commit, we have the same failure mode with AsciiDoc 8.6.10
and Asciidoctor 1.5.5, and this change makes both of them happy.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Acked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-18 10:53:33 -08:00
83171ede22 git.txt: fix monospace rendering
When we write `<name>`s with the "s" tucked on to the closing backtick,
we end up rendering the backticks literally. Rephrase this sentence
slightly to render this as monospace.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-18 10:53:33 -08:00
b9a43869c9 git-gui: remove lines starting with the comment character
The comment character is specified by the config variable
'core.commentchar'. Any lines starting with this character is considered
a comment and should not be included in the final commit message.

Teach git-gui to filter out lines in the commit message that start with
the comment character using git-stripspace. If the config is not set,
'#' is taken as the default. Also add a message educating users about
the comment character.

Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
2021-02-18 23:35:57 +05:30
2283e0e9af The ninth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-17 17:21:43 -08:00
483e09e810 Merge branch 'ak/config-bad-bool-error'
The error message given when a configuration variable that is
expected to have a boolean value has been improved.

* ak/config-bad-bool-error:
  config: improve error message for boolean config
2021-02-17 17:21:43 -08:00
e68f62be8d Merge branch 'js/reflog-expire-stale-fix'
"git reflog expire --stale-fix" can be used to repair the reflog by
removing entries that refer to objects that have been pruned away,
but was not careful to tolerate missing objects.

* js/reflog-expire-stale-fix:
  reflog expire --stale-fix: be generous about missing objects
2021-02-17 17:21:43 -08:00
726b11d68a Merge branch 'js/commit-graph-warning'
When certain features (e.g. grafts) used in the repository are
incompatible with the use of the commit-graph, we used to silently
turned commit-graph off; we now tell the user what we are doing.

* js/commit-graph-warning:
  commit-graph: when incompatible with graphs, indicate why
2021-02-17 17:21:42 -08:00
e9b4c483c7 Merge branch 'ew/rev-parse-since-test'
Test to make sure "git rev-parse one-thing one-thing" gives
the same thing twice (when one-thing is --since=X).

* ew/rev-parse-since-test:
  t1500: ensure current --since= behavior remains
2021-02-17 17:21:42 -08:00
d494433d26 Merge branch 'ds/maintenance-pack-refs'
"git maintenance" tool learned a new "pack-refs" maintenance task.

* ds/maintenance-pack-refs:
  maintenance: incremental strategy runs pack-refs weekly
  maintenance: add pack-refs task
2021-02-17 17:21:42 -08:00
fdf3a27ca9 Merge branch 'jx/t5411-unique-filenames'
Avoid individual tests in t5411 from getting affected by each other
by forcing them to use separate output files during the test.

* jx/t5411-unique-filenames:
  t5411: refactor check of refs using test_cmp_refs
  t5411: use different out file to prevent overwriting
2021-02-17 17:21:42 -08:00
9e634a91c8 Merge branch 'js/fsck-name-objects-fix'
Fix "git fsck --name-objects" which apparently has not been used by
anybody who is motivated enough to report breakage.

* js/fsck-name-objects-fix:
  fsck --name-objects: be more careful parsing generation numbers
  t1450: robustify `remove_object()`
2021-02-17 17:21:42 -08:00
9bdccbcda7 Merge branch 'jk/mailmap-only-at-root'
The .mailmap is documented to be read only from the root level of a
working tree, but a stray file in a bare repository also was read
by accident, which has been corrected.

* jk/mailmap-only-at-root:
  mailmap: only look for .mailmap in work tree
2021-02-17 17:21:42 -08:00
f712632a51 Merge branch 'mt/grep-cached-untracked'
"git grep --untracked" is meant to be "let's ALSO find in these
files on the filesystem" when looking for matches in the working
tree files, and does not make any sense if the primary search is
done against the index, or the tree objects.  The "--cached" and
"--untracked" options have been marked as mutually incompatible.

* mt/grep-cached-untracked:
  grep: error out if --untracked is used with --cached
2021-02-17 17:21:41 -08:00
78a26cb720 Merge branch 'sh/mergetool-hideresolved'
"git mergetool" feeds three versions (base, local and remote) of
a conflicted path unmodified.  The command learned to optionally
prepare these files with unconflicted parts already resolved.

* sh/mergetool-hideresolved:
  mergetool: add per-tool support and overrides for the hideResolved flag
  mergetool: break setup_tool out into separate initialization function
  mergetool: add hideResolved configuration
2021-02-17 17:21:41 -08:00
aa2d3dbdf5 Merge branch 'jt/trace2-BUG'
Even though invocations of "die()" were logged to the trace2
system, "BUG()"s were not, which has been corrected.

* jt/trace2-BUG:
  usage: trace2 BUG() invocations
2021-02-17 17:21:41 -08:00
dadc91ff0c Merge branch 'js/range-diff-one-side-only'
The "git range-diff" command learned "--(left|right)-only" option
to show only one side of the compared range.

* js/range-diff-one-side-only:
  range-diff: offer --left-only/--right-only options
  range-diff: move the diffopt initialization down one layer
  range-diff: combine all options in a single data structure
  range-diff: simplify code spawning `git log`
  range-diff: libify the read_patches() function again
  range-diff: avoid leaking memory in two error code paths
2021-02-17 17:21:41 -08:00
77348b0e6e Merge branch 'js/range-diff-wo-dotdot'
There are other ways than ".." for a single token to denote a
"commit range", namely "<rev>^!" and "<rev>^-<n>", but "git
range-diff" did not understand them.

* js/range-diff-wo-dotdot:
  range-diff(docs): explain how to specify commit ranges
  range-diff/format-patch: handle commit ranges other than A..B
  range-diff/format-patch: refactor check for commit range
2021-02-17 17:21:41 -08:00
69571dfe21 Merge branch 'jt/clone-unborn-head'
"git clone" tries to locally check out the branch pointed at by
HEAD of the remote repository after it is done, but the protocol
did not convey the information necessary to do so when copying an
empty repository.  The protocol v2 learned how to do so.

* jt/clone-unborn-head:
  clone: respect remote unborn HEAD
  connect, transport: encapsulate arg in struct
  ls-refs: report unborn targets of symrefs
2021-02-17 17:21:40 -08:00
0871fb9af5 Merge branch 'mr/bisect-in-c-4'
Piecemeal of rewrite of "git bisect" in C continues.

* mr/bisect-in-c-4:
  bisect--helper: retire `--check-and-set-terms` subcommand
  bisect--helper: reimplement `bisect_skip` shell function in C
  bisect--helper: retire `--bisect-auto-next` subcommand
  bisect--helper: use `res` instead of return in BISECT_RESET case option
  bisect--helper: retire `--bisect-write` subcommand
  bisect--helper: reimplement `bisect_replay` shell function in C
  bisect--helper: reimplement `bisect_log` shell function in C
2021-02-17 17:21:40 -08:00
5bd0b21bf7 Merge branch 'ds/commit-graph-genno-fix'
Fix incremental update of commit-graph file around corrected commit
date data.

* ds/commit-graph-genno-fix:
  commit-graph: prepare commit graph
  commit-graph: be extra careful about mixed generations
  commit-graph: compute generations separately
  commit-graph: validate layers for generation data
  commit-graph: always parse before commit_graph_data_at()
  commit-graph: use repo_parse_commit
2021-02-17 17:21:40 -08:00
8b4701ae4f Merge branch 'ak/corrected-commit-date'
The commit-graph learned to use corrected commit dates instead of
the generation number to help topological revision traversal.

* ak/corrected-commit-date:
  doc: add corrected commit date info
  commit-reach: use corrected commit dates in paint_down_to_common()
  commit-graph: use generation v2 only if entire chain does
  commit-graph: implement generation data chunk
  commit-graph: implement corrected commit date
  commit-graph: return 64-bit generation number
  commit-graph: add a slab to store topological levels
  t6600-test-reach: generalize *_three_modes
  commit-graph: consolidate fill_commit_graph_info
  revision: parse parent in indegree_walk_step()
  commit-graph: fix regression when computing Bloom filters
2021-02-17 17:21:40 -08:00
c1760352e0 grep/pcre2: move definitions of pcre2_{malloc,free}
Move the definitions of the pcre2_{malloc,free} functions above the
compile_pcre2_pattern() function they're used in.

Before the preceding commit they used to be needed earlier, but now we
can move them to be adjacent to the other PCREv2 functions.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-17 16:32:19 -08:00
cbe81e653f grep/pcre2: move back to thread-only PCREv2 structures
Change the setup of the "pcre2_general_context" to happen per-thread
in compile_pcre2_pattern() instead of in grep_init().

This change brings it in line with how the rest of the pcre2_* members
in the grep_pat structure are set up.

As noted in the preceding commit the approach 513f2b0bbd (grep: make
PCRE2 aware of custom allocator, 2019-10-16) took to allocate the
pcre2_general_context seems to have been initially based on a
misunderstanding of how PCREv2 memory allocation works.

The approach of creating a global context in grep_init() is just added
complexity for almost zero gain. On my system it's 24 bytes saved
per-thread. For comparison PCREv2 will then go on to allocate at least
a kilobyte for its own thread-local state.

As noted in 6d423dd542 (grep: don't redundantly compile throwaway
patterns under threading, 2017-05-25) the grep code is intentionally
not trying to micro-optimize allocations by e.g. sharing some PCREv2
structures globally, while making others thread-local.

So let's remove this special case and make all of them thread-local
again for simplicity. With this change we could move the
pcre2_{malloc,free} functions around to live closer to their current
use. I'm not doing that here to keep this change small, that cleanup
will be done in a follow-up commit.

See also the discussion in 94da9193a6 (grep: add support for PCRE v2,
2017-06-01) about thread safety, and Johannes's comments[1] to the
effect that we should be doing what this patch is doing.

1. https://lore.kernel.org/git/nycvar.QRO.7.76.6.1908052120302.46@tvgsbejvaqbjf.bet/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-17 16:32:19 -08:00
8d12851342 grep/pcre2: actually make pcre2 use custom allocator
Continue work started in 513f2b0bbd (grep: make PCRE2 aware of custom
allocator, 2019-10-16) and make PCREv2 use our pcre2_{malloc,free}().
functions for allocation. We'll now use it for all PCREv2 allocations.

The reason 513f2b0bbd worked as a bugfix for the USE_NED_ALLOCATOR
issue is because it targeted the allocation freed via free(), as
opposed to by a pcre2_*free() function. I.e. the pcre2_maketables()
and pcre2_maketables_free() pair.

For most of the rest we continued allocating with stock malloc()
inside PCREv2 itself, but didn't segfault because we'd use its
corresponding free().

In a preceding commit of mine I changed the free() to
pcre2_maketables_free() on versions of PCREv2 10.34 and newer. So as
far as fixing the segfault goes we could revert 513f2b0bbd. But then
we wouldn't use the desired allocator, let's just use it instead.

Before this patch we'd on e.g.:

    grep --threads=1 -iP æ.*var.*xyz

Only use pcre2_{malloc,free}() for 2 malloc() calls and 2
corresponding free() calls. Now it's 12 calls to each. This can be
observed with the GREP_PCRE2_DEBUG_MALLOC debug mode.

Reading the history of how this bug got introduced it wasn't present
in Johannes's original patch[1] to fix the issue.

My reading of that thread is that the approach the follow-up patches
to Johannes's original pursued were based on misunderstanding of how
the PCREv2 API works. In particular this part of [2]:

    "most of the time (like when using UTF-8) the chartable (and
    therefore the global context) is not needed (even when using
    alternate allocators)"

That's simply not how PCREv2 memory allocation works. It's easy to see
how the misunderstanding came about. It's because (as noted above) the
issue was noticed because of our use of free() in our own grep.c for
freeing the memory allocated by pcre2_maketables().

Thus the misunderstanding that PCREv2's compile context is something
only needed for pcre2_maketables(), and e.g. an aborted earlier
attempt[3] to only set it up when we ourselves called
pcre2_maketables().

That's not what PCREv2's compile context is. To quote PCREv2's
documentation:

    "This context just contains pointers to (and data for) external
    memory management functions that are called from several places in
    the PCRE2 library."

Thus the failed attempts to go down the route of only creating the
general context in cases where we ourselves call pcre2_maketables(),
before finally settling on the approach 513f2b0bbd took of always
creating it, but then mostly not using it.

Instead we should always create it, and then pass the general context
to those functions that accept it, so that they'll consistently use
our preferred memory allocation functions.

1. https://lore.kernel.org/git/3397e6797f872aedd18c6d795f4976e1c579514b.1565005867.git.gitgitgadget@gmail.com/
2. https://lore.kernel.org/git/CAPUEsphMh_ZqcH3M7PXC9jHTfEdQN3mhTAK2JDkdvKBp53YBoA@mail.gmail.com/
3. https://lore.kernel.org/git/20190806085014.47776-3-carenas@gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-17 16:32:19 -08:00
b76bf27f6a grep/pcre2: use pcre2_maketables_free() function
Make use of the pcre2_maketables_free() function to free the memory
allocated by pcre2_maketables().

At first sight it's strange that 10da030ab7 (grep: avoid leak of
chartables in PCRE2, 2019-10-16) which added the free() call here
doesn't make use of the pcre2_free() the author introduced in the
preceding commit in 513f2b0bbd (grep: make PCRE2 aware of custom
allocator, 2019-10-16).

The reason is that at the time the function didn't exist. It was first
introduced in PCREv2 version 10.34, released on 2019-11-21.

Let's make use of it behind a macro. I don't think this matters for
anything to do with custom allocators, but it makes our use of PCREv2
more discoverable.

At some distant point in the future we'll be able to drop the version
guard, as nobody will be running a version older than 10.34.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-17 16:32:19 -08:00
797c359978 grep/pcre2: use compile-time PCREv2 version test
Replace a use of pcre2_config(PCRE2_CONFIG_VERSION, ...) which I added
in 95ca1f987e (grep/pcre2: better support invalid UTF-8 haystacks,
2021-01-24) with the same test done at compile-time.

It might be cuter to do this at runtime since we don't have to do the
"major >= 11 || (major >= 10 && ...)" test. But in the next commit
we'll add another version comparison that absolutely needs to be done
at compile-time, so we're better of being consistent across the board.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-17 16:32:19 -08:00
a39b4003f0 grep/pcre2: add GREP_PCRE2_DEBUG_MALLOC debug mode
Add optional printing of PCREv2 allocations to stderr for a developer
who manually changes the GREP_PCRE2_DEBUG_MALLOC definition to "1".

You need to manually change the definition in the source file similar
to the DEBUG_MAILMAP, there's no Makefile knob for this.

This will be referenced a subsequent commit, and is generally useful
to manually see what's going on with PCREv2 allocations while working
on that code.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-17 16:32:19 -08:00
588e4fb191 grep/pcre2: prepare to add debugging to pcre2_malloc()
Change pcre2_malloc() in a way that'll make it easier for a debugging
fprintf() to spew out the allocated pointer.

This doesn't introduce any functional change, it just makes a
subsequent commit's diff easier to read. Changes code added in
513f2b0bbd (grep: make PCRE2 aware of custom allocator, 2019-10-16).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-17 16:32:19 -08:00
47eebd2fd2 grep/pcre2: correct reference to grep_init() in comment
Correct a comment added in 513f2b0bbd (grep: make PCRE2 aware of
custom allocator, 2019-10-16). This comment was never correct in
git.git, but was consistent with an older version of the patch[1].

1. https://lore.kernel.org/git/20190806163658.66932-3-carenas@gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-17 16:32:18 -08:00
1cfc5a850c grep/pcre2: drop needless assignment to NULL
Remove a redundant assignment of pcre2_compile_context dating back to
my 94da9193a6 (grep: add support for PCRE v2, 2017-06-01).

In create_grep_pat() we xcalloc() the "grep_pat" struct, so there's no
need to NULL out individual members here.

I think this was probably something left over from an earlier
development version of mine.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-17 16:32:18 -08:00
0ddf8ceac0 grep/pcre2: drop needless assignment + assert() on opt->pcre2
Drop an assignment added in b65abcafc7 (grep: use PCRE v2 for
optimized fixed-string search, 2019-07-01) and the overly cautious
assert() I added in 94da9193a6 (grep: add support for PCRE v2,
2017-06-01).

There was never a good reason for this, it's just a relic from when I
initially wrote the PCREv2 support. We're not going to have confusion
about compile_pcre2_pattern() being called when it shouldn't just
because we forgot to cargo-cult this opt->pcre2 option.

Furthermore the "struct grep_opt" is (mostly) used for the options the
user supplied, let's avoid the pattern of needlessly assigning to it.

With my recent removal of the PCREv1 backend in 7599730b7e (Remove
support for v1 of the PCRE library, 2021-01-24) there's even less
confusion around what we call where in these codepaths, which is one
more reason to remove this.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-17 16:32:18 -08:00
9d336655ba doc: fix naming of response-end-pkt
Git Protocol version 2[1] defines 0002 as a Message Packet that indicates
the end of a response for stateless connections.

Change the naming of the 0002 Packet to 'Response End' to match the
parsing introduced in Wireshark's MR !1922 for consistency. A subsequent
MR in Wireshark will address additional mismatches.

[1] kernel.org/pub/software/scm/git/docs/technical/protocol-v2.html
[2] gitlab.com/wireshark/wireshark/-/merge_requests/1922

Signed-off-by: Joey Salazar <jgsal@protonmail.com>
Reviewed-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-17 16:30:43 -08:00
a1db097e10 docs/rev-list: add some examples of --disk-usage
It's not immediately obvious why --disk-usage might be a useful thing.
These examples show off a few of the real-world cases I've used it for.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-17 16:25:29 -08:00
669b458755 docs/rev-list: add an examples section
We currently don't show any examples of using git-rev-list at all. Let's
add some pretty elementary examples. They likely seem obvious to anybody
who has worked with the tool for a while, but my purpose here is
two-fold:

  - they may be enlightening to people who haven't used the tool a lot
    to give a general flavor of how it is meant to be used

  - they can serve as a starting point for adding more interesting
    examples (we can do that without the basic ones, of course, but I
    think it makes sense to show off the building blocks)

This set is far from exhaustive, but again, the purpose is to be a
starting point for further additions.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-17 16:22:13 -08:00
452d26448d rev-list-options.txt: fix rendering of bonus paragraph
In git-log(1) -- but not in git-shortlog(1) or git-rev-list(1) -- we
include a bonus paragraph in the description of `--first-parent`. But
we forgot to add a lone "+" for a list continuation, and we shouldn't
be indenting this second paragraph. As a result, we get a different
indentation and the `backticks` render literally.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-17 13:16:11 -08:00
8e16effe97 blame: remove unnecessary use of get_commit_info()
When `git blame --color-by-age`, the determine_line_heat() is called to
select how to color the output based on the commit's author date.  It
uses the get_commit_info() to parse the information into a `commit_info`
structure, however, this is actually unnecessary because the
determine_line_heat() caller also does the same.

Instead, let's change the determine_line_heat() to take a `commit_info`
structure and remove the internal call to get_commit_info() thus
cleaning up and optimizing the code path.

Enabling Git's trace2 API in order to record the execution time for
every call to determine_line_heat() function:

   + trace2_region_enter("blame", "determine_line_heat", the_repository);
     determine_line_heat(ent, &default_color);
   + trace2_region_enter("blame", "determine_line_heat", the_repository);

Then, running `git blame` for "kernel/fork.c" in linux.git and summing
all the execution time for every call (around 1.3k calls) resulted in
2.6x faster execution (best out 3):

   git built from 328c109303 (The eighth batch, 2021-02-12) = 42ms
   git built from 328c109303 + this change                  = 16ms

Signed-off-by: Rafael Silva <rafaeloliveira.cs@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-17 11:04:17 -08:00
b081547ec1 pretty: add merge and exclude options to %(describe)
Allow restricting the tags used by the placeholder %(describe) with the
options match and exclude.  E.g. the following command describes the
current commit using official version tags, without those for release
candidates:

   $ git log -1 --format='%(describe:match=v[0-9]*,exclude=*rc*)'

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-17 09:54:33 -08:00
15ae82d5d6 pretty: add %(describe)
Add a format placeholder for describe output.  Implement it by actually
calling git describe, which is simple and guarantees correctness.  It's
intended to be used with $Format:...$ in files with the attribute
export-subst and git archive.  It can also be used with git log etc.,
even though that's going to be slow due to the fork for each commit.

Suggested-by: Eli Schwartz <eschwartz@archlinux.org>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-17 09:54:31 -08:00
fcd19b09f8 fsmonitor: refactor initialization of fsmonitor_last_update token
Isolate and document initialization of `istate->fsmonitor_last_update`.
This field should contain a fsmonitor-specific opaque token, but we
need to initialize it before we can actually talk to a fsmonitor process,
so we create a generic default value.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-16 17:14:35 -08:00
ff03836b9d fsmonitor: allow all entries for a folder to be invalidated
Allow fsmonitor to report directory changes by reporting paths with a
trailing slash.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Kevin Willford <Kevin.Willford@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-16 17:14:35 -08:00
29fbbf43a0 fsmonitor: log FSMN token when reading and writing the index
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-16 17:14:35 -08:00
940b94f35c fsmonitor: log invocation of FSMonitor hook to trace2
Let's measure the time taken to request and receive FSMonitor data
via the hook API and the size of the response.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-16 17:14:34 -08:00
15268d12be read-cache: log the number of scanned files to trace2
Report the number of files in the working directory that were read and
their hashes verified in `refresh_index()`.

FSMonitor improves the performance of commands like `git status` by
avoiding scanning the disk for changed files.  Let's measure this.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-16 17:14:34 -08:00
a98e0f2d31 read-cache: log the number of lstat calls to trace2
Report the total number of calls made to lstat() inside of refresh_index().

FSMonitor improves the performance of commands like `git status` by
avoiding scanning the disk for changed files.  This can be seen in
`refresh_index()`.  Let's measure this.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-16 17:14:34 -08:00
8c4b7503d0 preload-index: log the number of lstat calls to trace2
Report the total number of calls made to lstat() inside preload_index().

FSMonitor improves the performance of commands like `git status` by
avoiding scanning the disk for changed files.  This can be seen in
`preload_index()`.  Let's measure this.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-16 17:14:34 -08:00
4f2009dce2 p7519: add trace logging during perf test
Add optional trace logging to allow us to better compare performance of
various fsmonitor providers and compare results with non-fsmonitor runs.

Currently, this includes Trace2 logging, but may be extended to include
other trace targets, such as GIT_TRACE_FSMONITOR if desired.

Using this logging helped me explain an odd behavior on MacOS where the
kernel was dropping events and causing the hook to Watchman to timeout.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-16 17:14:34 -08:00
a7556c3bde p7519: move watchman cleanup earlier in the test
Shutdown Watchman after the Watchman-based tests and before the block of
"no fsmonitor" tests.

This helps ensure that Watchman cannot affect the test results for the
other.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-16 17:14:34 -08:00
0917763d67 p7519: fix watchman watch-list test on Windows
Only use the final portion of the test trash directory file name
when verifying that Watchman was started.

On Windows and under the SDK, $GIT_WORKTREE is a cygwin-style
path with forward slashes and a "/c/" drive name.  However
`watchman watch-list` reports a proper Windows-style pathname
with drive letters and backslashes.  This causes the grep to
fail.  Since we don't really care about the full pathname (and
we really don't want to bother with normalizaing them), just see
if the test-name portion of the path is found.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-16 17:14:34 -08:00
eb10e637cf p7519: do not rely on "xargs -d" in test
Convert the test to use a more portable method to update the mtime on a
large number of files under version control.

The Mac version of xargs does not support the "-d" option.
Likewise, the "-0" and "--null" options are not portable.

Furthermore, use `test-tool chmtime` rather than `touch` to update the
mtime to ensure that it is actually updated (especially on file systems
with only whole second resolution).

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-16 17:14:34 -08:00
3f7ba60350 checkout-index: omit entries with no tempname from --temp output
With --temp (or --stage=all, which implies --temp), checkout-index
writes a list to stdout associating temporary file names to the entries'
names. But if it fails to write an entry, and the failure happens before
even assigning a temporary filename to that entry, we get an odd output
line. This can be seen when trying to check out a symlink whose blob is
missing:

$ missing_blob=$(git hash-object --stdin </dev/null)
$ git update-index --add --cacheinfo 120000,$missing_blob,foo
$ git checkout-index --temp foo
error: unable to read sha1 file of foo (e69de29bb2)
        foo

The 'TAB foo' line is not much useful and it might break scripts that
expect the 'tempname TAB foo' output. So let's omit such entries from
the stdout list (but leaving the error message on stderr).

We could also consider omitting _all_ failed entries from the output
list, but that's probably not a good idea as the associated tempfiles
may have been created even when checkout failed, so scripts may want to
use the output list for cleanup.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-16 11:27:18 -08:00
9334ea8e92 write_entry(): fix misuses of path in error messages
The variables `path` and `ce->name`, at write_entry(), usually have the
same contents, but that's not the case when using a checkout prefix or
writing to a tempfile. (In fact, `path` will be either empty or dirty
when writing to a tempfile.) Therefore, these variables cannot be used
interchangeably. In this sense, fix wrong uses of `path` in error
messages where it should really be `ce->name`, and add some regression
tests. (Note: there doesn't seem to be any misuse in the other way
around.)

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-16 11:27:17 -08:00
adcd9f5472 mailmap: do not respect symlinks for in-tree .mailmap
As with .gitattributes and .gitignore, we would like to make sure that
.mailmap files are handled consistently whether read from the a blob (as
is the default behavior in a bare repo) or from the filesystem.
Likewise, we would like to avoid reading out-of-tree files pointed to by
a symlink, which could have security implications in certain setups.

We can cover both by using open_nofollow() when opening the in-tree
files. We'll continue to follow links for mailmap.file, as well as when
reading .mailmap from the current directory when outside of a repository
entirely.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-16 09:41:33 -08:00
feb9b7792f exclude: do not respect symlinks for in-tree .gitignore
As with .gitattributes, we would like to make sure that .gitignore files
are handled consistently whether read from the index or from the
filesystem. Likewise, we would like to avoid reading out-of-tree files
pointed to by the symlinks, which could have security implications in
certain setups.

We can cover both by using open_nofollow() when opening the in-tree
files. We'll continue to follow links for core.excludesFile, as well as
$GIT_DIR/info/exclude.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-16 09:41:33 -08:00
2ef579e261 attr: do not respect symlinks for in-tree .gitattributes
The attributes system may sometimes read in-tree files from the
filesystem, and sometimes from the index. In the latter case, we do not
resolve symbolic links (and are not likely to ever start doing so).
Let's open filesystem links with O_NOFOLLOW so that the two cases behave
consistently.

As a bonus, this means that git will not follow such symlinks to read
and parse out-of-tree paths. In some cases this could have security
implications, as a malicious repository can cause Git to open and read
arbitrary files. It could already feed arbitrary content to the parser,
but in certain setups it might be able to exfiltrate data from those
paths (e.g., if an automated service operating on the malicious repo
reveals its stderr to an attacker).

Note that O_NOFOLLOW only prevents following links for the path itself,
not intermediate directories in the path.  At first glance, it seems
like

  ln -s /some/path in-repo

might still look at "in-repo/.gitattributes", following the symlink to
"/some/path/.gitattributes". However, if "in-repo" is a symbolic link,
then we know that it has no git paths below it, and will never look at
its .gitattributes file.

We will continue to support out-of-tree symbolic links (e.g., in
$GIT_DIR/info/attributes); this just affects in-tree links. When a
symbolic link is encountered, the contents are ignored and a warning is
printed. POSIX specifies ELOOP in this case, so the user would generally
see something like:

  warning: unable to access '.gitattributes': Too many levels of symbolic links

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-16 09:41:33 -08:00
1679d60bfc exclude: add flags parameter to add_patterns()
There are a number of callers of add_patterns() and its sibling
functions. Let's give them a "flags" parameter for adding new options
without having to touch each caller. We'll use this in a future patch to
add O_NOFOLLOW support. But for now each caller just passes 0.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-16 09:41:33 -08:00
dbf387d550 attr: convert "macro_ok" into a flags field
The attribute code can have a rather deep callstack, through
which we have to pass the "macro_ok" flag. In anticipation
of adding other flags, let's convert this to a generic
bit-field.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-16 09:41:32 -08:00
00611d8440 add open_nofollow() helper
Some callers of open() would like to use O_NOFOLLOW, but it is not
available on all platforms. Let's abstract this into a helper function
so we can provide system-specific implementations.

Some light web-searching reveals that we might be able to get something
similar on Windows using FILE_FLAG_OPEN_REPARSE_POINT. I didn't dig into
this further.

For other systems without O_NOFOLLOW or any equivalent, we have two
options for fallback:

  - we can just open anyway, following symlinks; this may have security
    implications (e.g., following untrusted in-tree symlinks)

  - we can determine whether the path is a symlink with lstat().

    This is slower (two syscalls instead of one), but that may be
    acceptable for infrequent uses like looking up .gitattributes files
    (especially because we can get away with a single syscall for the
    common case of ENOENT).

    It's also racy, but should be sufficient for our needs (we are
    worried about in-tree symlinks that we ourselves would have
    previously created). We could make it non-racy at the cost of making
    it even slower, by doing an fstat() on the opened descriptor and
    comparing the dev/ino fields to the original lstat().

This patch implements the lstat() option in its slightly-faster racy
form.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-16 09:41:32 -08:00
1eb4136ac2 diff: --{rotate,skip}-to=<path>
In the implementation of "git difftool", there is a case where the
user wants to start viewing the diffs at a specific path and
continue on to the rest, optionally wrapping around to the
beginning.  Since it is somewhat cumbersome to implement such a
feature as a post-processing step of "git diff" output, let's
support it internally with two new options.

 - "git diff --rotate-to=C", when the resulting patch would show
   paths A B C D E without the option, would "rotate" the paths to
   shows patch to C D E A B instead.  It is an error when there is
   no patch for C is shown.

 - "git diff --skip-to=C" would instead "skip" the paths before C,
   and shows patch to C D E.  Again, it is an error when there is no
   patch for C is shown.

 - "git log [-p]" also accepts these two options, but it is not an
   error if there is no change to the specified path.  Instead, the
   set of output paths are rotated or skipped to the specified path
   or the first path that sorts after the specified path.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-16 09:30:42 -08:00
f78cf97617 merge-ort: call diffcore_rename() directly
We want to pass additional information to diffcore_rename() (or some
variant thereof) without plumbing that extra information through
diff_tree_oid() and diffcore_std().  Further, since we will need to
gather additional special information related to diffs and are walking
the trees anyway in collect_merge_info(), it seems odd to have
diff_tree_oid()/diffcore_std() repeat those tree walks.  And there may
be times where we can avoid traversing into a subtree in
collect_merge_info() (based on additional information at our disposal),
that the basic diff logic would be unable to take advantage of.  For all
these reasons, just create the add and delete pairs ourself and then
call diffcore_rename() directly.

This change is primarily about enabling future optimizations; the
advantage of avoiding extra tree traversals is small compared to the
cost of rename detection, and the advantage of avoiding the extra tree
traversals is somewhat offset by the extra time spent in
collect_merge_info() collecting the additional data anyway.  However...

For the testcases mentioned in commit 557ac0350d ("merge-ort: begin
performance work; instrument with trace2_region_* calls", 2020-10-28),
this change improves the performance as follows:

                            Before                  After
    no-renames:       13.294 s ±  0.103 s    12.775 s ±  0.062 s
    mega-renames:    187.248 s ±  0.882 s   188.754 s ±  0.284 s
    just-one-mega:     5.557 s ±  0.017 s     5.599 s ±  0.019 s

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-15 18:02:16 -08:00
07c9a7fcb5 gitdiffcore doc: mention new preliminary step for rename detection
The last few patches have introduced a new preliminary step when rename
detection is on but both break detection and copy detection are off.
Document this new step.  While we're at it, add a testcase that checks
the new behavior as well.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-15 18:02:16 -08:00
bd24aa2f97 diffcore-rename: guide inexact rename detection based on basenames
Make use of the new find_basename_matches() function added in the last
two patches, to find renames more rapidly in cases where we can match up
files based on basenames.  As a quick reminder (see the last two commit
messages for more details), this means for example that
docs/extensions.txt and docs/config/extensions.txt are considered likely
renames if there are no remaining 'extensions.txt' files elsewhere among
the added and deleted files, and if a similarity check confirms they are
similar, then they are marked as a rename without looking for a better
similarity match among other files.  This is a behavioral change, as
covered in more detail in the previous commit message.

We do not use this heuristic together with either break or copy
detection.  The point of break detection is to say that filename
similarity does not imply file content similarity, and we only want to
know about file content similarity.  The point of copy detection is to
use more resources to check for additional similarities, while this is
an optimization that uses far less resources but which might also result
in finding slightly fewer similarities.  So the idea behind this
optimization goes against both of those features, and will be turned off
for both.

For the testcases mentioned in commit 557ac0350d ("merge-ort: begin
performance work; instrument with trace2_region_* calls", 2020-10-28),
this change improves the performance as follows:

                            Before                  After
    no-renames:       13.815 s ±  0.062 s    13.294 s ±  0.103 s
    mega-renames:   1799.937 s ±  0.493 s   187.248 s ±  0.882 s
    just-one-mega:    51.289 s ±  0.019 s     5.557 s ±  0.017 s

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-15 18:02:16 -08:00
da09f65127 diffcore-rename: complete find_basename_matches()
It is not uncommon in real world repositories for the majority of file
renames to not change the basename of the file; i.e. most "renames" are
just a move of files into different directories.  We can make use of
this to avoid comparing all rename source candidates with all rename
destination candidates, by first comparing sources to destinations with
the same basenames.  If two files with the same basename are
sufficiently similar, we record the rename; if not, we include those
files in the more exhaustive matrix comparison.

This means we are adding a set of preliminary additional comparisons,
but for each file we only compare it with at most one other file.  For
example, if there was a include/media/device.h that was deleted and a
src/module/media/device.h that was added, and there are no other
device.h files in the remaining sets of added and deleted files after
exact rename detection, then these two files would be compared in the
preliminary step.

This commit does not yet actually employ this new optimization, it
merely adds a function which can be used for this purpose.  The next
commit will do the necessary plumbing to make use of it.

Note that this optimization might give us different results than without
the optimization, because it's possible that despite files with the same
basename being sufficiently similar to be considered a rename, there's
an even better match between files without the same basename.  I think
that is okay for four reasons: (1) it's easy to explain to the users
what happened if it does ever occur (or even for them to intuitively
figure out), (2) as the next patch will show it provides such a large
performance boost that it's worth the tradeoff, and (3) it's somewhat
unlikely that despite having unique matching basenames that other files
serve as better matches.  Reason (4) takes a full paragraph to
explain...

If the previous three reasons aren't enough, consider what rename
detection already does.  Break detection is not the default, meaning
that if files have the same _fullname_, then they are considered related
even if they are 0% similar.  In fact, in such a case, we don't even
bother comparing the files to see if they are similar let alone
comparing them to all other files to see what they are most similar to.
Basically, we override content similarity based on sufficient filename
similarity.  Without the filename similarity (currently implemented as
an exact match of filename), we swing the pendulum the opposite
direction and say that filename similarity is irrelevant and compare a
full N x M matrix of sources and destinations to find out which have the
most similar contents.  This optimization just adds another form of
filename similarity comparison, but augments it with a file content
similarity check as well.  Basically, if two files have the same
basename and are sufficiently similar to be considered a rename, mark
them as such without comparing the two to all other rename candidates.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-15 18:02:16 -08:00
a35df3371c diffcore-rename: compute basenames of source and dest candidates
We want to make use of unique basenames among remaining source and
destination files to help inform rename detection, so that more likely
pairings can be checked first.  (src/moduleA/foo.txt and
source/module/A/foo.txt are likely related if there are no other
'foo.txt' files among the remaining deleted and added files.)  Add a new
function, not yet used, which creates a map of the unique basenames
within rename_src and another within rename_dst, together with the
indices within rename_src/rename_dst where those basenames show up.
Non-unique basenames still show up in the map, but have an invalid index
(-1).

This function was inspired by the fact that in real world repositories,
files are often moved across directories without changing names.  Here
are some sample repositories and the percentage of their historical
renames (as of early 2020) that preserved basenames:
  * linux: 76%
  * gcc: 64%
  * gecko: 79%
  * webkit: 89%
These statistics alone don't prove that an optimization in this area
will help or how much it will help, since there are also unpaired adds
and deletes, restrictions on which basenames we consider, etc., but it
certainly motivated the idea to try something in this area.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-15 18:02:16 -08:00
f3845257a5 t4001: add a test comparing basename similarity and content similarity
Add a simple test where a removed file is similar to two different added
files; one of them has the same basename, and the other has a slightly
higher content similarity.  In the current test, content similarity is
weighted higher than filename similarity.

Subsequent commits will add a new rule that weighs a mixture of filename
similarity and content similarity in a manner that will change the
outcome of this testcase.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-15 18:02:16 -08:00
829514c515 diffcore-rename: filter rename_src list when possible
We have to look at each entry in rename_src a total of rename_dst_nr
times.  When we're not detecting copies, any exact renames or ignorable
rename paths will just be skipped over.  While checking that these can
be skipped over is a relatively cheap check, it's still a waste of time
to do that check more than once, let alone rename_dst_nr times.  When
rename_src_nr is a few thousand times bigger than the number of relevant
sources (such as when cherry-picking a commit that only touched a
handful of files, but from a side of history that has different names
for some high level directories), this time can add up.

First make an initial pass over the rename_src array and move all the
relevant entries to the front, so that we can iterate over just those
relevant entries.

For the testcases mentioned in commit 557ac0350d ("merge-ort: begin
performance work; instrument with trace2_region_* calls", 2020-10-28),
this change improves the performance as follows:

                            Before                  After
    no-renames:       14.119 s ±  0.101 s    13.815 s ±  0.062 s
    mega-renames:   1802.044 s ±  0.828 s  1799.937 s ±  0.493 s
    just-one-mega:    51.391 s ±  0.028 s    51.289 s ±  0.019 s

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-15 18:02:16 -08:00
ee82a487f6 ref-filter: use pretty.c logic for trailers
Now, ref-filter is using pretty.c logic for setting trailer options.

New to ref-filter:
  :key=<K> - only show trailers with specified key.
  :valueonly[=val] - only show the value part.
  :separator=<SEP> - inserted between trailer lines.
  :key_value_separator=<SEP> - inserted between key and value in trailer lines

Enhancement to existing options(now can take value and its optional):
  :only[=val]
  :unfold[=val]

'val' can be: true, on, yes or false, off, no.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-15 16:48:38 -08:00
636a0aeedf pretty.c: capture invalid trailer argument
As we would like to use this trailers logic in the ref-filter, it's
nice to get an invalid trailer argument. This will allow us to print
precise error message while using `format_set_trailers_options()` in
ref-filter.

For capturing the invalid argument, we changed the working of
`format_set_trailers_options()` a little bit.
Original logic does "break" and fell through in mainly 2 cases -
    1. unknown/invalid argument
    2. end of the arg string

But now instead of "break", we capture invalid argument and return
non-zero. And non-zero is handled by the caller.
(We prepared the caller to handle non-zero in the previous commit).

Capturing invalid arguments this way will also affects the working
of current logic. As at the end of the arg string it will return non-zero.
So in order to make things correct, introduced an additional conditional
statement i.e if encounter ")", do 'break'.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-15 16:48:38 -08:00
90563aedca pretty.c: refactor trailer logic to format_set_trailers_options()
Refactored trailers formatting logic inside pretty.c to a new function
`format_set_trailers_options()`. This new function returns the non-zero
in case of unusual. The caller handles the non-zero by "goto trailers_out".

This change will allow us to reuse the same logic in other places.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-15 16:48:38 -08:00
727331dce1 t6300: use function to test trailer options
Add a function to test trailer options. This will make tests look cleaner,
as well as will make it easier to add new tests for trailers in the future.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-15 16:48:38 -08:00
328c109303 The eighth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-12 14:21:04 -08:00
8b25dee615 Merge branch 'tb/precompose-prefix-too'
When commands are started from a subdirectory, they may have to
compare the path to the subdirectory (called prefix and found out
from $(pwd)) with the tracked paths.  On macOS, $(pwd) and
readdir() yield decomposed path, while the tracked paths are
usually normalized to the precomposed form, causing mismatch.  This
has been fixed by taking the same approach used to normalize the
command line arguments.

* tb/precompose-prefix-too:
  MacOS: precompose_argv_prefix()
2021-02-12 14:21:04 -08:00
006c5f79be Merge branch 'jk/complete-branch-force-delete'
The command line completion (in contrib/) completed "git branch -d"
with branch names, but "git branch -D" offered tagnames in addition,
which has been corrected.  "git branch -M" had the same problem.

* jk/complete-branch-force-delete:
  doc/git-branch: fix awkward wording for "-c"
  completion: handle other variants of "branch -m"
  completion: treat "branch -D" the same way as "branch -d"
2021-02-12 14:21:04 -08:00
60f8121940 Merge branch 'jv/upload-pack-filter-spec-quotefix'
Fix in passing custom args from "git clone" to "upload-pack" on the
other side.

* jv/upload-pack-filter-spec-quotefix:
  t5544: clarify 'hook works with partial clone' test
  upload-pack.c: fix filter spec quoting bug
2021-02-12 14:21:04 -08:00
3c12d0b885 Merge branch 'tb/pack-revindex-on-disk'
Introduce an on-disk file to record revindex for packdata, which
traditionally was always created on the fly and only in-core.

* tb/pack-revindex-on-disk:
  t5325: check both on-disk and in-memory reverse index
  pack-revindex: ensure that on-disk reverse indexes are given precedence
  t: support GIT_TEST_WRITE_REV_INDEX
  t: prepare for GIT_TEST_WRITE_REV_INDEX
  Documentation/config/pack.txt: advertise 'pack.writeReverseIndex'
  builtin/pack-objects.c: respect 'pack.writeReverseIndex'
  builtin/index-pack.c: write reverse indexes
  builtin/index-pack.c: allow stripping arbitrary extensions
  pack-write.c: prepare to write 'pack-*.rev' files
  packfile: prepare for the existence of '*.rev' files
2021-02-12 14:21:04 -08:00
2c873f9791 Merge branch 'ab/tests-various-fixup'
Various test updates.

* ab/tests-various-fixup:
  rm tests: actually test for SIGPIPE in SIGPIPE test
  archive tests: use a cheaper "zipinfo -h" invocation to get header
  upload-pack tests: avoid a non-zero "grep" exit status
  git-svn tests: rewrite brittle tests to use "--[no-]merges".
  git svn mergeinfo tests: refactor "test -z" to use test_must_be_empty
  git svn mergeinfo tests: modernize redirection & quoting style
  cache-tree tests: explicitly test HEAD and index differences
  cache-tree tests: use a sub-shell with less indirection
  cache-tree tests: remove unused $2 parameter
  cache-tree tests: refactor for modern test style
2021-02-12 14:21:04 -08:00
f15eb7c1cf diffcore-rename: no point trying to find a match better than exact
diffcore_rename() had some code to avoid having destination paths that
already had an exact rename detected from being re-checked for other
renames.  Source paths, however, were re-checked because we wanted to
allow the possibility of detecting copies.  But if copy detection isn't
turned on, then this merely amounts to attempting to find a
better-than-exact match, which naturally ends up being an expensive
no-op.  In particular, copy detection is never turned on by the merge
machinery.

For the testcases mentioned in commit 557ac0350d ("merge-ort: begin
performance work; instrument with trace2_region_* calls", 2020-10-28),
this change improves the performance as follows:

                            Before                  After
    no-renames:       14.263 s ±  0.053 s    14.119 s ±  0.101 s
    mega-renames:   5504.231 s ±  5.150 s  1802.044 s ±  0.828 s
    just-one-mega:   158.534 s ±  0.498 s    51.391 s ±  0.028 s

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-12 12:04:00 -08:00
e7884b353b test-lib-functions: assert correct parameter count
Add assertions of the correct parameter count of various functions, in
particularly the wrappers for the shell "test" built-in.

In an earlier commit we fixed a bug with an incorrect number of
arguments being passed to "test_path_is_{file,missing}". Let's also
guard other similar functions from the same sort of misuse.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-12 11:58:21 -08:00
45a2686441 test-lib-functions: remove bug-inducing "diagnostics" helper param
Remove the optional "diagnostics" parameter of the
test_path_is_{file,dir,missing} functions.

We have a lot of uses of these functions, but the only legitimate use
of the diagnostics parameter is from when the functions themselves
were introduced in 2caf20c52b (test-lib: user-friendly alternatives
to test [-d|-f|-e], 2010-08-10).

But as the the rest of this diff demonstrates its presence did more to
silently introduce bugs in our tests. Fix such bugs in the tests added
in ae4e89e549 (gc: add --keep-largest-pack option, 2018-04-15), and
c04ba51739 (t6046: testcases checking whether updates can be skipped
in a merge, 2018-04-19).

Let's also assert that those functions are called with exactly one
parameter, a follow-up commit will add similar asserts to other
functions in test-lib-functions.sh that we didn't have existing misuse
of.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-12 11:58:21 -08:00
ebd73f50c6 test libs: rename "diff-lib" to "lib-diff"
Rename the "diff-lib" to "lib-diff". With this rename and preceding
commits there is no remaining t/*lib* which doesn't follow the
convention of being called t/lib-*.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-12 11:58:21 -08:00
f011795891 Sync with maint 2021-02-11 13:58:52 -08:00
d3a035b055 Merge branch 'en/merge-ort-perf'
The "ort" merge strategy.

* en/merge-ort-perf:
  merge-ort: begin performance work; instrument with trace2_region_* calls
  merge-ort: ignore the directory rename split conflict for now
  merge-ort: fix massive leak
2021-02-11 13:58:44 -08:00
a21e27ef6b Merge branch 'en/ort-directory-rename'
ORT merge strategy learns to infer "renamed directory" while
merging.

* en/ort-directory-rename:
  merge-ort: fix a directory rename detection bug
  merge-ort: process_renames() now needs more defensiveness
  merge-ort: implement apply_directory_rename_modifications()
  merge-ort: add a new toplevel_dir field
  merge-ort: implement handle_path_level_conflicts()
  merge-ort: implement check_for_directory_rename()
  merge-ort: implement apply_dir_rename() and check_dir_renamed()
  merge-ort: implement compute_collisions()
  merge-ort: modify collect_renames() for directory rename handling
  merge-ort: implement handle_directory_level_conflicts()
  merge-ort: implement compute_rename_counts()
  merge-ort: copy get_renamed_dir_portion() from merge-recursive.c
  merge-ort: add outline of get_provisional_directory_renames()
  merge-ort: add outline for computing directory renames
  merge-ort: collect which directories are removed in dirs_removed
  merge-ort: initialize and free new directory rename data structures
  merge-ort: add new data structures for directory rename detection
2021-02-11 13:58:43 -08:00
f276e2a469 config: improve error message for boolean config
Currently invalid boolean config values return messages about 'bad
numeric', which is slightly misleading when the error was due to a
boolean value. We can improve the developer experience by returning a
boolean error message when we know the value is neither a bool text or
int.

before with an invalid boolean value of `non-boolean`, its unclear what
numeric is referring to:
  fatal: bad numeric config value 'non-boolean' for 'commit.gpgsign': invalid unit

now the error message mentions `non-boolean` is a bad boolean value:
  fatal: bad boolean config value 'non-boolean' for 'commit.gpgsign'

Signed-off-by: Andrew Klotz <agc.klotz@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-11 13:44:55 -08:00
488acf15df t7001: use test rather than [
According to Documentation/CodingGuidelines, we should use "test"
rather than "[ ... ]" in shell scripts, so let's replace the
"[ ... ]" with "test" in the t7001 test script.

Signed-off-by: Shubham Verma <shubhunic@gmail.com>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-11 13:42:17 -08:00
39252c833e t7001: use here-docs instead of echo
Change from old style to current style by taking advantage of
here-docs instead of echo commands.

Signed-off-by: Shubham Verma <shubhunic@gmail.com>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-11 13:42:16 -08:00
5d683c3f4b t7001: put each command on a separate line
Modern practice is to avoid multiple commands per line, and
instead place each command on its own line.

Signed-off-by: Shubham Verma <shubhunic@gmail.com>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-11 13:42:16 -08:00
d2ecddc981 t7001: use '>' rather than 'touch'
Use `>` rather than `touch` to create an empty file when the
timestamp isn't relevant to the test.

Signed-off-by: Shubham Verma <shubhunic@gmail.com>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-11 13:42:16 -08:00
368d278249 t7001: avoid using cd outside of subshells
Avoid using `cd` outside of subshells since, if the test fails,
there is no guarantee that the current working directory is the
expected one, which may cause subsequent tests to run in the wrong
directory.

While at it, make some other tests more concise by replacing
simple subshells with `git -C`.

Signed-off-by: Shubham Verma <shubhunic@gmail.com>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-11 13:42:16 -08:00
dd72154149 t7001: remove whitespace after redirect operators
According to Documentation/CodingGuidelines, there should be no
whitespace after redirect operators. So, we should remove these
whitespaces after redirect operators.

Signed-off-by: Shubham Verma <shubhunic@gmail.com>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-11 13:42:16 -08:00
9bcaeb71a6 t7001: modernize subshell formatting
Some test use an old style for formatting subshells:

        (command &&
            ...

Update them to the modern style:

        (
            command &&
            ...

Signed-off-by: Shubham Verma <shubhunic@gmail.com>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-11 13:42:16 -08:00
9b46e9c9cc t7001: remove unnecessary blank lines
Some tests use a deprecated style in which there are unnecessary
blank lines after the opening quote of the test body and before the
closing quote. So we should remove these unnecessary blank lines.

Signed-off-by: Shubham Verma <shubhunic@gmail.com>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-11 13:42:16 -08:00
a76d90670a t7001: indent with TABs instead of spaces
Signed-off-by: Shubham Verma <shubhunic@gmail.com>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-11 13:42:16 -08:00
5712d62ccf t7001: modernize test formatting
Some tests in this script are formatted using a very old style:

        test_expect_success \
            'title' \
            'body line 1 &&
            body line 2'

Update the formatting to the modern style:

        test_expect_success 'title' '
            body line 1 &&
            body line 2
        '

Signed-off-by: Shubham Verma <shubhunic@gmail.com>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-11 13:42:16 -08:00
3e885f0277 stash: declare ref_stash as an array
Save sizeof(const char *) bytes by declaring ref_stash as an array
instead of having a redundant pointer to an array.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-11 13:34:58 -08:00
8c2462d1fe t3905: use test_cmp() to check file contents
Modernize the script by doing file content comparisons using test_cmp()
instead of `test x = "$(cat file)"`.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-11 13:34:58 -08:00
27e25a8cbf t3905: replace test -s with test_file_not_empty
In order to modernize the test script, replace `test -s` with
test_file_not_empty(), which provides better diagnostic output in the
case of failure.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-11 13:34:58 -08:00
389ece4022 t3905: remove nested git in command substitution
If a git command in a nested command substitution fails, it will be
silently ignored since only the return code of the outer command
substitutions is reported. Factor out nested command substitutions so
that the error codes of those commands are reported.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-11 13:34:58 -08:00
bbaa45c3aa t3905: move all commands into test cases
In order to modernize the tests, move commands that currently run
outside of test cases into a test case. Where possible, clean up files
that are produced using test_when_finished() but in the case where files
persist over multiple test cases, create a new test case to perform
cleanup.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-11 13:34:58 -08:00
32b7385e43 t3905: remove spaces after redirect operators
For shell scripts, the usual convention is for there to be no space
after redirection operators, (e.g. `>file`, not `> file`). Remove these
spaces wherever they appear.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-11 13:34:58 -08:00
d6ab8b1929 git-stash.txt: be explicit about subcommand options
Currently, the options for the `list` and `show` subcommands are just
listed as `<options>`. This seems to imply, from a cursory glance at the
summary, that they take the stash options listed below. However, reading
more carefully, we see that they take log options and diff options
respectively.

Make it more obvious that they take log and diff options by explicitly
stating this in the subcommand summary.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-11 13:34:58 -08:00
16950f8384 rev-list: add --disk-usage option for calculating disk usage
It can sometimes be useful to see which refs are contributing to the
overall repository size (e.g., does some branch have a bunch of objects
not found elsewhere in history, which indicates that deleting it would
shrink the size of a clone).

You can find that out by generating a list of objects, getting their
sizes from cat-file, and then summing them, like:

    git rev-list --objects --no-object-names main..branch
    git cat-file --batch-check='%(objectsize:disk)' |
    perl -lne '$total += $_; END { print $total }'

Though note that the caveats from git-cat-file(1) apply here. We "blame"
base objects more than their deltas, even though the relationship could
easily be flipped. Still, it can be a useful rough measure.

But one problem is that it's slow to run. Teaching rev-list to sum up
the sizes can be much faster for two reasons:

  1. It skips all of the piping of object names and sizes.

  2. If bitmaps are in use, for objects that are in the
     bitmapped packfile we can skip the oid_object_info()
     lookup entirely, and just ask the revindex for the
     on-disk size.

This patch implements a --disk-usage option which produces the same
answer in a fraction of the time. Here are some timings using a clone of
torvalds/linux:

  [rev-list piped to cat-file, no bitmaps]
  $ time git rev-list --objects --no-object-names --all |
    git cat-file --buffer --batch-check='%(objectsize:disk)' |
    perl -lne '$total += $_; END { print $total }'
  1459938510
  real	0m29.635s
  user	0m38.003s
  sys	0m1.093s

  [internal, no bitmaps]
  $ time git rev-list --disk-usage --objects --all
  1459938510
  real	0m31.262s
  user	0m30.885s
  sys	0m0.376s

Even though the wall-clock time is slightly worse due to parallelism,
notice the CPU savings between the two. We saved 21% of the CPU just by
avoiding the pipes.

But the real win is with bitmaps. If we use them without the new option:

  [rev-list piped to cat-file, bitmaps]
  $ time git rev-list --objects --no-object-names --all --use-bitmap-index |
    git cat-file --batch-check='%(objectsize:disk)' |
    perl -lne '$total += $_; END { print $total }'
  1459938510
  real	0m6.244s
  user	0m8.452s
  sys	0m0.311s

then we're faster to generate the list of objects, but we still spend a
lot of time piping and looking things up. But if we do both together:

  [internal, bitmaps]
  $ time git rev-list --disk-usage --objects --all --use-bitmap-index
  1459938510
  real	0m0.219s
  user	0m0.169s
  sys	0m0.049s

then we get the same answer much faster.

For "--all", that answer will correspond closely to "du objects/pack",
of course. But we're actually checking reachability here, so we're still
fast when we ask for more interesting things:

  $ time git rev-list --disk-usage --use-bitmap-index v5.0..v5.10
  374798628
  real	0m0.429s
  user	0m0.356s
  sys	0m0.072s

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-11 09:57:55 -08:00
c85eec7fc3 commit-graph: when incompatible with graphs, indicate why
When `gc.writeCommitGraph = true`, it is possible that the commit-graph
is _still_ not written: replace objects, grafts and shallow repositories
are incompatible with the commit-graph feature.

Under such circumstances, we need to indicate to the user why the
commit-graph was not written instead of staying silent about it.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-11 09:33:01 -08:00
c809798b2a reflog expire --stale-fix: be generous about missing objects
Whenever a user runs `git reflog expire --stale-fix`, the most likely
reason is that their repository is at least _somewhat_ corrupt. Which
means that it is more than just possible that some objects are missing.

If that is the case, that can currently let the command abort through
the phase where it tries to mark all reachable objects.

Instead of adding insult to injury, let's be gentle and continue as best
as we can in such a scenario, simply by ignoring the missing objects and
moving on.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-11 09:21:52 -08:00
c45dc9cf30 diff: plug memory leak from regcomp() on {log,diff} -I
Fix a memory leak in 296d4a94e7 (diff: add -I<regex> that ignores
matching changes, 2020-10-20) by freeing the memory it allocates in
the newly introduced diff_free(). See the previous commit for details
on that.

This memory leak was intentionally introduced in 296d4a94e7, see the
discussion on a previous iteration of it in
https://lore.kernel.org/git/xmqqeelycajx.fsf@gitster.c.googlers.com/

At that time freeing the memory was somewhat tedious, but since it
isn't anymore with the newly introduced diff_free() let's use it.

Let's retain the pattern for diff_free_file() and add a
diff_free_ignore_regex(), even though (unlike "diff_free_file") we
don't need to call it elsewhere. I think this'll make for more
readable code than gradually accumulating a giant diff_free()
function, sharing "int i" across unrelated code etc.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-11 09:21:07 -08:00
e900d494dc diff: add an API for deferred freeing
Add a diff_free() function to free anything we may have allocated in
the "diff_options" struct, and the ability to make calling it a noop
by setting "no_free" in "diff_options".

This is required because when e.g. "git diff" is run we'll allocate
things in that struct, use the diff machinery once, and then exit.

But if we run e.g. "git log -p" we're going to re-use what we
allocated across multiple diff_flush() calls, and only want to free
things at the end.

We've thus ended up with features like the recently added "diff -I"[1]
where we'll leak memory. As it turns out it could have simply used the
pattern established in 6ea57703f6 (log: prepare log/log-tree to reuse
the diffopt.close_file attribute, 2016-06-22).

Manually adding more such flags to things log_tree_commit() every time
we need to allocate something would be tedious. Let's instead move
that fclose() code it to a new diff_free(), in anticipation of freeing
more things in that function in follow-up commits.

Some functions such as log_tree_commit() need an idiom of optionally
retaining a previous "no_free", as they may either free the memory
themselves, or their caller may do so. I'm keeping that idiom in
log_show_early() for good measure, even though I don't think it's
currently called in this manner. It also gets passed an existing
"struct rev_info", so future callers may want to set the "no_free"
flag.

This change is a bit hard to read because while the freeing pattern
we're introducing isn't unusual, the "file" member is a special
snowflake. We usually don't want to fclose() it. This is because
"file" is usually stdout, in which case we don't want to fclose()
it. We only want to opt-in to closing it when we e.g. open a file on
the filesystem. Thus the opt-in "close_file" flag.

So the API in general just needs a "no_free" flag to defer freeing,
but the "file" member still needs its "close_file" flag. This is made
more confusing because while refactoring this code we could replace
some "close_file=0" with "no_free=1", whereas others need to set both
flags.

This is because there were some cases where an existing "close_file=0"
meant "let's defer deallocation", and others where it meant "we don't
want to close this file handle at all".

1. 296d4a94e7 (diff: add -I<regex> that ignores matching changes,
   2020-10-20)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-11 09:21:05 -08:00
1108cea7f8 tests: remove most uses of test_i18ncmp
As a follow-up to d162b25f95 (tests: remove support for
GIT_TEST_GETTEXT_POISON, 2021-01-20) remove most uses of test_i18ncmp
via a simple s/test_i18ncmp/test_cmp/g search-replacement.

I'm leaving t6300-for-each-ref.sh out due to a conflict with in-flight
changes between "master" and "seen", as well as the prerequisite
itself due to other changes between "master" and "next/seen" which add
new test_i18ncmp uses.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-10 23:48:27 -08:00
b1e079807b tests: remove last uses of C_LOCALE_OUTPUT
Remove the last uses of the C_LOCALE_OUTPUT prerequisite as well as
the prerequisite itself. This is a follow-up to d162b25f95 (tests:
remove support for GIT_TEST_GETTEXT_POISON, 2021-01-20), as well as
the preceding commit where we removed the simpler uses of
C_LOCALE_OUTPUT.

Here I'm slightly refactoring a test added in 21e5ad50fc (safecrlf:
Add mechanism to warn about irreversible crlf conversions,
2008-02-06), as well as getting rid of another "test_have_prereq
C_LOCALE_OUTPUT" use.

I'm not leaving the prerequisite itself in place for in-flight changes
as there currently are none that introduce new tests that rely on it,
and because C_LOCALE_OUTPUT is currently a noop on the master branch
we likely won't have any new submissions that use it.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-10 23:48:27 -08:00
a926c4b904 tests: remove most uses of C_LOCALE_OUTPUT
As a follow-up to d162b25f95 (tests: remove support for
GIT_TEST_GETTEXT_POISON, 2021-01-20) remove those uses of the now
always true C_LOCALE_OUTPUT prerequisite from those tests which
declare it as an argument to test_expect_{success,failure}.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-10 23:48:26 -08:00
780aa0a21e tests: remove last uses of GIT_TEST_GETTEXT_POISON=false
Follow-up my 73c01d25fe (tests: remove uses of
GIT_TEST_GETTEXT_POISON=false, 2021-01-20) by removing the last uses
of GIT_TEST_GETTEXT_POISON=*.

These assignments were part of branch that was in-flight at the time
of the gettext poison removal. See 466f94ec45 (Merge branch
'ab/detox-gettext-tests', 2021-02-10) and c7d6d419b0 (Merge branch
'ab/mktag', 2021-01-25) for the merging of the two branches.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-10 23:48:26 -08:00
fa9ab027ba docs: clarify that refs/notes/ do not keep the attached objects alive
`git help gc` contains this snippet:

  "[...] it will keep [..] objects referenced by the index,
  remote-tracking branches, notes saved by git notes under refs/notes/"

I had interpreted that as saying that the objects that notes were
attached to are kept, but that is not the case. Let's clarify the
documentation by moving out the part about git notes to a separate
sentence.

Signed-off-by: Martin von Zweigbergk <martinvonz@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-10 23:43:55 -08:00
9b27b49240 gpg-interface: remove other signature headers before verifying
When we have a multiply signed commit, we need to remove the signature
in the header before verifying the object, since the trailing signature
will not be over both pieces of data.  Do so, and verify that we
validate the signature appropriately.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-10 23:35:42 -08:00
88bce0e24c ref-filter: hoist signature parsing
When we parse a signature in the ref-filter code, we continually
increment the buffer pointer.  Hoist the signature parsing above the
blank line delimiting headers and body so we can find the signature when
using a header to sign the buffer.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-10 23:35:42 -08:00
937032e14a commit: allow parsing arbitrary buffers with headers
Currently only commits are signed with headers.  However, in the future,
we'll also sign tags with headers as well.  Let's refactor out a
function called parse_buffer_signed_by_header which does exactly that.
In addition, since we'll want to sign things other than commits this
way, let's call the function sign_with_header instead of do_sign_commit.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-10 23:35:42 -08:00
482c119186 gpg-interface: improve interface for parsing tags
We have a function which parses a buffer with a signature at the end,
parse_signature, and this function is used for signed tags.  However,
we'll need to store values for multiple algorithms, and we'll do this by
using a header for the non-default algorithm.

Adjust the parse_signature interface to store the parsed data in two
strbufs and turn the existing function into parse_signed_buffer.  The
latter is still used in places where we know we always have a signed
buffer, such as push certs.

Adjust all the callers to deal with this new interface.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-10 23:35:42 -08:00
c6102b7585 Merge branch 'tb/ci-run-cocci-with-18.04'
The version of Ubuntu Linux used by default at GitHub Actions CI
has been updated to one that lack coccinelle; until it gets fixed,
work it around by sticking to the previous release (18.04).

* tb/ci-run-cocci-with-18.04:
  .github/workflows/main.yml: run static-analysis on bionic
2021-02-10 16:48:07 -08:00
f9f2520108 The seventh batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-10 14:48:33 -08:00
466f94ec45 Merge branch 'ab/detox-gettext-tests'
Get rid of "GETTEXT_POISON" support altogether, which may or may
not be controversial.

* ab/detox-gettext-tests:
  tests: remove uses of GIT_TEST_GETTEXT_POISON=false
  tests: remove support for GIT_TEST_GETTEXT_POISON
  ci: remove GETTEXT_POISON jobs
2021-02-10 14:48:33 -08:00
59ace284f3 Merge branch 'ab/grep-pcre-invalid-utf8'
Update support for invalid UTF-8 in PCRE2.

* ab/grep-pcre-invalid-utf8:
  grep/pcre2: better support invalid UTF-8 haystacks
  grep/pcre2 tests: don't rely on invalid UTF-8 data test
2021-02-10 14:48:33 -08:00
0199c68d01 Merge branch 'ab/retire-pcre1'
The support for deprecated PCRE1 library has been dropped.

* ab/retire-pcre1:
  Remove support for v1 of the PCRE library
  config.mak.uname: remove redundant NO_LIBPCRE1_JIT flag
2021-02-10 14:48:33 -08:00
938ecaa42f Merge branch 'jk/pretty-lazy-load-commit'
Some pretty-format specifiers do not need the data in commit object
(e.g. "%H"), but we were over-eager to load and parse it, which has
been made even lazier.

* jk/pretty-lazy-load-commit:
  pretty: lazy-load commit data when expanding user-format
2021-02-10 14:48:33 -08:00
2f794620f5 Merge branch 'ds/more-index-cleanups'
Cleaning various codepaths up.

* ds/more-index-cleanups:
  t1092: test interesting sparse-checkout scenarios
  test-lib: test_region looks for trace2 regions
  sparse-checkout: load sparse-checkout patterns
  name-hash: use trace2 regions for init
  repository: add repo reference to index_state
  fsmonitor: de-duplicate BUG()s around dirty bits
  cache-tree: extract subtree_pos()
  cache-tree: simplify verify_cache() prototype
  cache-tree: clean up cache_tree_update()
2021-02-10 14:48:33 -08:00
02fb21617e Merge branch 'rs/worktree-list-verbose'
`git worktree list` now annotates worktrees as prunable, shows
locked and prunable attributes in --porcelain mode, and gained
a --verbose option.

* rs/worktree-list-verbose:
  worktree: teach `list` verbose mode
  worktree: teach `list` to annotate prunable worktree
  worktree: teach `list --porcelain` to annotate locked worktree
  t2402: ensure locked worktree is properly cleaned up
  worktree: teach worktree_lock_reason() to gently handle main worktree
  worktree: teach worktree to lazy-load "prunable" reason
  worktree: libify should_prune_worktree()
2021-02-10 14:48:32 -08:00
7e94720c1e Merge branch 'js/rebase-i-commit-cleanup-fix'
When "git rebase -i" processes "fixup" insn, there is no reason to
clean up the commit log message, but we did the usual stripspace
processing.  This has been corrected.

* js/rebase-i-commit-cleanup-fix:
  rebase -i: do leave commit message intact in fixup! chains
2021-02-10 14:48:32 -08:00
e5abed92f5 Merge branch 'jk/t0000-cleanups'
Code clean-up.

* jk/t0000-cleanups:
  t0000: consistently use single quotes for outer tests
  t0000: run cleaning test inside sub-test
  t0000: run prereq tests inside sub-test
  t0000: keep clean-up tests together
2021-02-10 14:48:32 -08:00
04703f64be Merge branch 'sg/t7800-difftool-robustify'
Test fix.

* sg/t7800-difftool-robustify:
  t7800-difftool: don't accidentally match tmp dirs
2021-02-10 14:48:32 -08:00
c9f94ab4fa Merge branch 'ab/lose-grep-debug'
Lose the debugging aid that may have been useful in the past, but
no longer is, in the "grep" codepaths.

* ab/lose-grep-debug:
  grep/log: remove hidden --debug and --grep-debug options
2021-02-10 14:48:31 -08:00
9d5b1c06ac Merge branch 'jk/use-oid-pos'
Code clean-up to ensure our use of hashtables using object names as
keys use the "struct object_id" objects, not the raw hash values.

* jk/use-oid-pos:
  oid_pos(): access table through const pointers
  hash_pos(): convert to oid_pos()
  rerere: use strmap to store rerere directories
  rerere: tighten rr-cache dirname check
  rerere: check dirname format while iterating rr_cache directory
  commit_graft_pos(): take an oid instead of a bare hash
2021-02-10 14:48:31 -08:00
a5cdca4520 t1500: ensure current --since= behavior remains
This behavior of git-rev-parse is observed since git 1.8.3.1
at least(*), and likely earlier versions.

At least one git-reliant project in-the-wild relies on this
current behavior of git-rev-parse being able to handle multiple
--since= arguments without squeezing identical results together.
So add a test to prevent the potential for regression in
downstream projects.

(*) 1.8.3.1 the version packaged for CentOS 7.x

Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-10 14:24:13 -08:00
fa153c1cd7 doc/rebase -i: fix typo in the documentation of 'fixup' command
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-10 13:58:19 -08:00
9ff6b74bb7 t/t3437: fixup the test 'multiple fixup -c opens editor once'
In the test, FAKE_COMMIT_MESSAGE replaces the commit message each
time it is invoked so there will be only one instance of "Modified-A3"
no matter how many times we invoke the editor. Let's fix this and use
FAKE_COMMIT_AMEND instead so that it adds "Modified-A3" once for each
time the editor is invoked.

This patch also removes the check for counting the number of
"Modified-A3" lines and instead compares the whole message to check
that the commenting code works correctly for 'fixup -c' as well as
'fixup -C'.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-10 13:58:19 -08:00
9c7650c45c t/t3437: use named commits in the tests
Use the named commits in the tests so that they will still refer to the
same commit if the setup gets changed in the future whereas 'branch~2'
will change which commit it points to.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-10 13:58:19 -08:00
d8bd08066d t/t3437: simplify and document the test helpers
Let's simplify the test_commit_message() helper function and add
comments to the function.

This patch also document the working of 'fixup -C' with "amend!" in the
test-description.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-10 13:58:19 -08:00
4755fed0a6 t/t3437: check the author date of fixed up commit
Add '%at' format in the get_author() function and update the test to
check that the author date of the fixed up commit is unchanged.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-10 13:58:19 -08:00
733ad2e15a t/t3437: remove the dependency of 'expected-message' file from tests
As it is currently implemented, it's too difficult to follow along and
remember the value of "expected-message" from test to test. It also
makes it difficult to extend tests or add new tests in between existing
tests without negatively impacting other tests.

Let's set up "expected-message" to the precise content needed by the
test, so that both the problems go away and also makes easier to run
tests selectively with '--run' or 'GIT_SKIP_TESTS'

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-10 13:58:19 -08:00
17665167bb t/t3437: fixup here-docs in the 'setup' test
The most common way to format here-docs in Git test scripts is for the
body and EOF to be indented the same amount as the command which opened
the here-doc. Fix a few here-docs in this script to conform to that
standard and also remove the unnecessary curly braces.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-10 13:58:19 -08:00
75ace8329c t/lib-rebase: update the documentation of FAKE_LINES
FAKE_LINES helper function use underscore to embed a space in a single
command. Let's document it and also update the list of commands.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-10 13:58:19 -08:00
f07871d302 rebase -i: clarify and fix 'fixup -c' rebase-todo help
When `-c` says "edit the commit message" it's not clear what will be
edited. The original's commit message or the replacement's message or a
combination of the two. Word it such that it states more precisely what
exactly will be edited. While at it, also drop the jarring period and
capitalization, neither of which is otherwise present in the message.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-10 13:58:19 -08:00
59934417ff t/.gitattributes: sort lines
Sort the lines starting with "/", the only out-of-place line was added
along with most of the file in 614f4f0f35 (Fix the remaining tests
that failed with core.autocrlf=true, 2017-05-09).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-10 13:54:34 -08:00
ddfe900612 test-lib-functions: move function to lib-bitmap.sh
Move a function added to test-lib-functions.sh in ea047a8eb4 (t5310:
factor out bitmap traversal comparison, 2020-02-14) into a new
lib-bitmap.sh.

The test-lib-functions.sh file should be for functions that are widely
used across the test suite, if something's only used by a few tests it
makes more sense to have it in a lib-*.sh file.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-10 13:54:34 -08:00
3fca1fc651 test libs: rename gitweb-lib.sh to lib-gitweb.sh
Rename gitweb-lib.sh to lib-gitweb.sh for consistency with other test
library files.

When it was introduced in 05526071cb (gitweb: split test suite into
library and tests, 2009-08-27) this naming pattern was more
common.

Since then all but one other such library which didn't start with
"lib-*.sh" such as t6000lib.sh has been been renamed, see
e.g. 9d488eb40e (Move t6000lib.sh to lib-*, 2010-05-07).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-10 13:54:34 -08:00
e8a8e7ff98 test libs: rename bundle helper to "lib-bundle.sh"
Rename the recently introduced test-bundle-functions.sh to be
consistent with other lib-*.sh files, which is the convention for
these sorts of shared test library functions.

The new test-bundle-functions.sh was introduced in 9901164d81 (test:
add helper functions for git-bundle, 2021-01-11). It was the only
test-*.sh of this nature.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-10 13:54:34 -08:00
f3ad2bf471 test-lib-functions: remove generate_zero_bytes() wrapper
Since d5cfd142ec (tests: teach the test-tool to generate NUL bytes
and use it, 2019-02-14) the generate_zero_bytes() functions has been a
thin wrapper for "test-tool genzeros". Let's have its only user call
that directly instead.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-10 13:54:34 -08:00
762ccf9906 test-lib-functions: move test_set_index_version() to its user
Move the test_set_index_version() function to its only user. This
function has only been used in one place since its addition in
5d9fc888b4 (test-lib: allow setting the index format version,
2014-02-23). Let's have that test script define it.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-10 13:54:34 -08:00
9e9c7dd6f1 test lib: change "error" to "BUG" as appropriate
Change two uses of "error" in test-lib-functions.sh to "BUG".

In the first instance in "test_cmp_rev" the author of the "BUG"
function added in [1] had another in-flight patch adding this in [2],
and the two were never consolidated.

In the second case in "test_atexit" added in [3] that we could have
instead used "BUG" appears to have been missed.

1. 165293af3c (tests: send "bug in the test script" errors to the
   script's stderr, 2018-11-19)

2. 30d0b6dccb (test-lib-functions: make 'test_cmp_rev' more
   informative on failure, 2018-11-19)

3. 900721e15c (test-lib: introduce 'test_atexit', 2019-03-13)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-10 13:54:34 -08:00
c0eedbc009 test-lib: remove check_var_migration
Remove the check_var_migration() migration helper. This was added back
in [1], [2] and [3] to warn users to migrate from e.g. the
"GIT_FSMONITOR_TEST" name to "GIT_TEST_FSMONITOR".

I daresay that having been warning about this since late 2018 (or
v2.20.0) was sufficient time to give everyone interested a heads-up
about moving to the new names.

I don't see the need for going through the "do this later" codepath
anticipated in [1], let's just remove this instead.

1. 4cb54d0aa8 (fsmonitor: update GIT_TEST_FSMONITOR support,
   2018-09-18)
2. 1f357b045b (read-cache: update TEST_GIT_INDEX_VERSION support,
   2018-09-18)
3. 5765d97b71 (preload-index: update GIT_FORCE_PRELOAD_TEST support,
   2018-09-18)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-10 13:54:34 -08:00
a38cb9878a mailmap: only look for .mailmap in work tree
When trying to find a .mailmap file, we will always look for it in the
current directory. This makes sense in a repository with a working tree,
since we'd always go to the toplevel directory at startup. But for a
bare repository, it can be confusing. With an option like --git-dir (or
$GIT_DIR in the environment), we don't chdir at all, and we'd read
.mailmap from whatever directory you happened to be in before starting
Git.

(Note that --git-dir without specifying a working tree historically
means "the current directory is the root of the working tree", but most
bare repositories will have core.bare set these days, meaning they will
realize there is no working tree at all).

The documentation for gitmailmap(5) says:

  If the file `.mailmap` exists at the toplevel of the repository[...]

which likewise reinforces the notion that we are looking in the working
tree.

This patch prevents us from looking for such a file when we're in a bare
repository. This does break something that used to work:

  cd bare.git
  git cat-file blob HEAD:.mailmap >.mailmap
  git shortlog

But that was never advertised in the documentation. And these days we
have mailmap.blob (which defaults to HEAD:.mailmap) to do the same thing
in a much cleaner way.

However, there's one more interesting case: we might not have a
repository at all! The git-shortlog command can be run with git-log
output fed on its stdin, and it will apply the mailmap. In that case, it
probably does make sense to read .mailmap from the current directory.
This patch will continue to do so.

That leads to one even weirder case: if you run git-shortlog to process
stdin, the input _could_ be from a different repository entirely. Should
we respect the in-tree .mailmap then? Probably yes. Whatever the source
of the input, if shortlog is running in a repository, the documentation
claims that we'd read the .mailmap from its top-level (and of course
it's reasonably likely that it _is_ from the same repo, and the user
just preferred to run git-log and git-shortlog separately for whatever
reason).

The included test covers these cases, and we now document the "no repo"
case explicitly.

We also add a test that confirms we find a top-level ".mailmap" even
when we start in a subdirectory of the working tree. This worked both
before and after this commit, but we never tested it explicitly (it
works because we always chdir to the top-level of the working tree if
there is one).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-10 13:34:51 -08:00
e89f89361c fsck --name-objects: be more careful parsing generation numbers
In 7b35efd734 (fsck_walk(): optionally name objects on the go,
2016-07-17), the `fsck` machinery learned to optionally name the
objects, so that it is easier to see what part of the repository is in a
bad shape, say, when objects are missing.

To save on complexity, this machinery uses a parser to determine the
name of a parent given a commit's name: any `~<n>` suffix is parsed and
the parent's name is formed from the prefix together with `~<n+1>`.

However, this parser has a bug: if it finds a suffix `<n>` that is _not_
`~<n>`, it will mistake the empty string for the prefix and `<n>` for
the generation number. In other words, it will generate a name of the
form `~<bogus-number>`.

Let's fix this.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-10 12:38:05 -08:00
8c891eed3a t1450: robustify remove_object()
This function can be simplified by using the `test_oid_to_path()`
helper, which incidentally also makes it more robust by not relying on
the exact file system layout of the loose object files.

While at it, do not define those functions in a test case, it buys us
nothing.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-10 12:38:00 -08:00
42d906bec4 grep: honor sparse-checkout on working tree searches
On a sparse checked out repository, `git grep` (without --cached) ends
up searching the cache when an entry matches the search pathspec and has
the SKIP_WORKTREE bit set. This is confusing both because the sparse
paths are not expected to be in a working tree search (as they are not
checked out), and because the output mixes working tree and cache
results without distinguishing them. (Note that grep also resorts to the
cache on working tree searches that include --assume-unchanged paths.
But the whole point in that case is to assume that the contents of the
index entry and the file are the same. This does not apply to the case
of sparse paths, where the file isn't even expected to be present.)

Fix that by teaching grep to honor the sparse-checkout rules for working
tree searches. If the user wants to grep paths outside the current
sparse-checkout definition, they may either update the sparsity rules to
materialize the files, or use --cached to search all blobs registered in
the index.

Note: it might also be interesting to add a configuration option that
allow users to search paths that are present despite having the
SKIP_WORKTREE bit set, and/or to restrict searches in the index and past
revisions too. These ideas are left as future improvements to avoid
conflicting with other sparse-checkout topics currently in flight.

Suggested-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-09 23:10:51 -08:00
acc1c4d5d4 maintenance: incremental strategy runs pack-refs weekly
When the 'maintenance.strategy' config option is set to 'incremental',
a default maintenance schedule is enabled. Add the 'pack-refs' task to
that strategy at the weekly cadence.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-09 23:09:29 -08:00
41abfe15d9 maintenance: add pack-refs task
It is valuable to collect loose refs into a more compressed form. This
is typically the packed-refs file, although this could be the reftable
in the future. Having packed refs can be extremely valuable in repos
with many tags or remote branches that are not modified by the local
user, but still are necessary for other queries.

For instance, with many exploded refs, commands such as

	git describe --tags --exact-match HEAD

can be very slow (multiple seconds). This command in particular is used
by terminal prompts to show when a detatched HEAD is pointing to an
existing tag, so having it be slow causes significant delays for users.

Add a new 'pack-refs' maintenance task. It runs 'git pack-refs --all
--prune' to move loose refs into a packed form. For now, that is the
packed-refs file, but could adjust to other file formats in the future.

This is the first of several sub-tasks of the 'gc' task that could be
extracted to their own tasks. In this process, we should not change the
behavior of the 'gc' task since that remains the default way to keep
repositories maintained. Creating a new task for one of these sub-tasks
only provides more customization options for those choosing to not use
the 'gc' task. It is certainly possible to have both the 'gc' and
'pack-refs' tasks enabled and run regularly. While they may repeat
effort, they do not conflict in a destructive way.

The 'auto_condition' function pointer is left NULL for now. We could
extend this in the future to have a condition check if pack-refs should
be run during 'git maintenance run --auto'.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-09 23:09:24 -08:00
0a9dde4a04 usage: trace2 BUG() invocations
die() messages are traced in trace2, but BUG() messages are not. Anyone
tracking die() messages would have even more reason to track BUG().
Therefore, write to trace2 when BUG() is invoked.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-09 14:14:34 -08:00
9d9cf23031 mergetool: add per-tool support and overrides for the hideResolved flag
Add a per-tool override flag so that users may enable the flag for one
tool and disable it for another by setting
`mergetool.<tool>.hideResolved` to `false`.

In addition, the author or maintainer of a mergetool may optionally
override the default `hideResolved` value for that mergetool. If the
`mergetools/<tool>` shell script contains a `hide_resolved_enabled`
function it will be called when the mergetool is invoked and the return
value will be used as the default for the `hideResolved` flag.

    hide_resolved_enabled () {
        return 1
    }

Disabling may be desirable if the mergetool wants or needs access to the
original, unmodified 'LOCAL' and 'REMOTE' versions of the conflicted
file. For example:

- A tool may use a custom conflict resolution algorithm and prefer to
  ignore the results of Git's conflict resolution.
- A tool may want to visually compare/constrast the version of the file
  from before the merge (saved to 'LOCAL', 'REMOTE', and 'BASE') with
  Git's conflict resolution results (saved to 'MERGED').

Helped-by: Johannes Sixt <j6t@kdbg.org>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Seth House <seth@eseth.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-09 14:09:16 -08:00
de8dafbada mergetool: break setup_tool out into separate initialization function
This is preparation for the following commit where we need to source the
mergetool shell script to look for overrides before `run_merge_tool` is
called. Previously `run_merge_tool` both sourced that script and invoked
the mergetool.

In the case of the following commit, we need the result of the
`hide_resolved` override, if present, before we actually run
`run_merge_tool`.

The new `initialize_merge_tool` wrapper is exposed and documented as
a public interface for consistency with the existing `run_merge_tool`
which is also public. Although `setup_tool` could instead be exposed
directly, the related `setup_user_tool` would probably also want to be
elevated to match and this felt the cleanest to me.

Signed-off-by: Seth House <seth@eseth.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-09 14:09:16 -08:00
98ea309b3f mergetool: add hideResolved configuration
The purpose of a mergetool is to help the user resolve any conflicts
that Git cannot automatically resolve. If there is a conflict that must
be resolved manually Git will write a file named MERGED which contains
everything Git was able to resolve by itself and also everything that it
was not able to resolve wrapped in conflict markers.

One way to think of MERGED is as a two- or three-way diff. If each
"side" of the conflict markers is separately extracted an external tool
can represent those conflicts as a side-by-side diff.

However many mergetools instead diff LOCAL and REMOTE both of which
contain versions of the file from before the merge. Since the conflicts
Git resolved automatically are not present it forces the user to
manually re-resolve those conflicts. Some mergetools also show MERGED
but often only for reference and not as the focal point to resolve the
conflicts.

This adds a `mergetool.hideResolved` flag that will overwrite LOCAL and
REMOTE with each corresponding "side" of a conflicted file and thus hide
all conflicts that Git was able to resolve itself. Overwriting these
files will immediately benefit any mergetool that uses them without
requiring any changes to the tool.

No adverse effects were noted in a small survey of popular mergetools[1]
so this behavior defaults to `true`. However it can be globally disabled
by setting `mergetool.hideResolved` to `false`.

[1] https://www.eseth.org/2020/mergetools.html
    c884424769/2020/mergetools.md

Original-implementation-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Seth House <seth@eseth.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-09 14:09:16 -08:00
3803a3a099 t: add --no-tag option to test_commit
One of the conveniences that test_commit offers is making a tag for each
commit. This makes it easy to refer to the commits in subsequent
commands. But it can also be a pain if you care about reachability,
because those tags keep the commits reachable even if they are rewound
from the branch they're made on.

The alternative is that scripts have to call test_tick, git-add, and
git-commit themselves. Let's add a --no-tag option to give them the
one-liner convenience of using test_commit.

This is in preparation for the next patch, which will add some more
calls. But I cleaned up an existing site to show off the feature. There
are probably more cleanups possible.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-09 13:36:06 -08:00
0c5d83b248 grep: error out if --untracked is used with --cached
The options --untracked and --cached are not compatible, but if they are
used together, grep just silently ignores --cached and searches the
working tree. Error out, instead, to avoid any potential confusion.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-09 12:39:06 -08:00
1d4f2316c5 Sync with 2.30.1 2021-02-08 14:44:42 -08:00
1f9696019a sequencer: rename a few functions
Rename functions to make them more descriptive and while at it, remove
unnecessary 'inline' of the skip_fixupish() function.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-08 13:09:57 -08:00
a25314c1ec sequencer: fixup the datatype of the 'flag' argument
As 'flag' is a combination of bits, so change its datatype from
'enum todo_item_flags' to 'unsigned'.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-08 13:09:57 -08:00
2cc543deab range-diff(docs): explain how to specify commit ranges
There are three forms, depending whether the user specifies one, two or
three non-option arguments. We've never actually explained how this
works in the manual, so let's explain it.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-06 21:24:55 -08:00
359f0d754a range-diff/format-patch: handle commit ranges other than A..B
In the `SPECIFYING RANGES` section of gitrevisions[7], two ways are
described to specify commit ranges that `range-diff` does not yet
accept: "<commit>^!" and "<commit>^-<n>".

Let's accept them, by parsing them via the revision machinery and
looking for at least one interesting and one uninteresting revision in
the resulting `pending` array.

This also finally lets us reject arguments that _do_ contain `..` but
are not actually ranges, e.g. `HEAD^{/do.. match this}`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-06 21:24:55 -08:00
1e79f97326 range-diff: offer --left-only/--right-only options
When comparing commit ranges, one is frequently interested only in one
side, such as asking the question "Has this patch that I submitted to
the Git mailing list been applied?": one would only care about the part
of the output that corresponds to the commits in a local branch.

To make that possible, imitate the `git rev-list` options `--left-only`
and `--right-only`.

This addresses https://github.com/gitgitgadget/git/issues/206

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-06 21:14:31 -08:00
3e6046edad range-diff: move the diffopt initialization down one layer
It is actually only the `output()` function that uses those diffopts. By
moving the diffopt initialization down into that function, it is
encapsulated better.

Incidentally, it will also make it easier to implement the `--left-only`
and `--right-only` options in `git range-diff` because the `output()`
function is now receiving all range-diff options as a parameter, not
just the diffopts.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-06 21:14:31 -08:00
f1ce6c191e range-diff: combine all options in a single data structure
This will make it easier to implement the `--left-only` and
`--right-only` options.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-06 21:14:31 -08:00
fb7fa4a1fd Sync with maint 2021-02-05 16:41:17 -08:00
4527ecdc8d The sixth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-05 16:40:46 -08:00
4513f6bbb1 Merge branch 'sg/test-stress-jobs'
Test framework fix.

* sg/test-stress-jobs:
  test-lib: prevent '--stress-jobs=X' from being ignored
2021-02-05 16:40:46 -08:00
dfc3c2b224 Merge branch 'jk/weather-balloon-require-variadic-macro'
We've carried compatibility codepaths for compilers without
variadic macros for quite some time, but the world may be ready for
them to be removed.  Force compilation failure on exotic platforms
where variadic macros are not available to find out who screams in
such a way that we can easily revert if it turns out that the world
is not yet ready.

* jk/weather-balloon-require-variadic-macro:
  git-compat-util: always enable variadic macros
2021-02-05 16:40:46 -08:00
b6c90a2a22 Merge branch 'pb/ci-matrix-wo-shortcut'
Our setting of GitHub CI test jobs were a bit too eager to give up
once there is even one failure found.  Tweak the knob to allow
other jobs keep running even when we see a failure, so that we can
find more failures in a single run.

* pb/ci-matrix-wo-shortcut:
  ci: do not cancel all jobs of a matrix if one fails
2021-02-05 16:40:46 -08:00
61b159e219 Merge branch 'pb/blame-funcname-range-userdiff'
Test fix.

* pb/blame-funcname-range-userdiff:
  annotate-tests: quote variable expansions containing path names
2021-02-05 16:40:45 -08:00
4cc0e8794d Merge branch 'jk/p5303-sed-portability-fix'
A perf script was made more portable.

* jk/p5303-sed-portability-fix:
  p5303: avoid sed GNU-ism
2021-02-05 16:40:45 -08:00
77db59c2f9 Merge branch 'jv/pack-objects-narrower-ref-iteration'
The "pack-objects" command needs to iterate over all the tags when
automatic tag following is enabled, but it actually iterated over
all refs and then discarded everything outside "refs/tags/"
hierarchy, which was quite wasteful.

* jv/pack-objects-narrower-ref-iteration:
  builtin/pack-objects.c: avoid iterating all refs
2021-02-05 16:40:45 -08:00
f6ef8baba2 Merge branch 'ph/use-delete-refs'
When removing many branches and tags, the code used to do so one
ref at a time.  There is another API it can use to delete multiple
refs, and it makes quite a lot of performance difference when the
refs are packed.

* ph/use-delete-refs:
  use delete_refs when deleting tags or branches
2021-02-05 16:40:45 -08:00
6254fa1359 Merge branch 'tb/ls-refs-optim'
The ls-refs protocol operation has been optimized to narrow the
sub-hierarchy of refs/ it walks to produce response.

* tb/ls-refs-optim:
  ls-refs.c: traverse prefixes of disjoint "ref-prefix" sets
  ls-refs.c: initialize 'prefixes' before using it
  refs: expose 'for_each_fullref_in_prefixes'
2021-02-05 16:40:45 -08:00
5198426d91 Merge branch 'zh/ls-files-deduplicate'
"git ls-files" can and does show multiple entries when the index is
unmerged, which is a source for confusion unless -s/-u option is in
use.  A new option --deduplicate has been introduced.

* zh/ls-files-deduplicate:
  ls-files.c: add --deduplicate option
  ls_files.c: consolidate two for loops into one
  ls_files.c: bugfix for --deleted and --modified
2021-02-05 16:40:44 -08:00
a0a2d75d3b Merge branch 'ds/cache-tree-basics'
Document, clean-up and optimize the code around the cache-tree
extension in the index.

* ds/cache-tree-basics:
  cache-tree: speed up consecutive path comparisons
  cache-tree: use ce_namelen() instead of strlen()
  index-format: discuss recursion of cache-tree better
  index-format: update preamble to cache tree extension
  index-format: use 'cache tree' over 'cached tree'
  cache-tree: trace regions for prime_cache_tree
  cache-tree: trace regions for I/O
  cache-tree: use trace2 in cache_tree_update()
  unpack-trees: add trace2 regions
  tree-walk: report recursion counts
2021-02-05 16:40:44 -08:00
b65b9ff1ff Merge branch 'en/ort-conflict-handling'
ORT merge strategy learns more support for merge conflicts.

* en/ort-conflict-handling:
  merge-ort: add handling for different types of files at same path
  merge-ort: copy find_first_merges() implementation from merge-recursive.c
  merge-ort: implement format_commit()
  merge-ort: copy and adapt merge_submodule() from merge-recursive.c
  merge-ort: copy and adapt merge_3way() from merge-recursive.c
  merge-ort: flesh out implementation of handle_content_merge()
  merge-ort: handle book-keeping around two- and three-way content merge
  merge-ort: implement unique_path() helper
  merge-ort: handle directory/file conflicts that remain
  merge-ort: handle D/F conflict where directory disappears due to merge
2021-02-05 16:40:44 -08:00
aac006aa99 Merge branch 'so/log-diff-merge'
"git log" learned a new "--diff-merges=<how>" option.

* so/log-diff-merge: (32 commits)
  t4013: add tests for --diff-merges=first-parent
  doc/git-show: include --diff-merges description
  doc/rev-list-options: document --first-parent changes merges format
  doc/diff-generate-patch: mention new --diff-merges option
  doc/git-log: describe new --diff-merges options
  diff-merges: add '--diff-merges=1' as synonym for 'first-parent'
  diff-merges: add old mnemonic counterparts to --diff-merges
  diff-merges: let new options enable diff without -p
  diff-merges: do not imply -p for new options
  diff-merges: implement new values for --diff-merges
  diff-merges: make -m/-c/--cc explicitly mutually exclusive
  diff-merges: refactor opt settings into separate functions
  diff-merges: get rid of now empty diff_merges_init_revs()
  diff-merges: group diff-merge flags next to each other inside 'rev_info'
  diff-merges: split 'ignore_merges' field
  diff-merges: fix -m to properly override -c/--cc
  t4013: add tests for -m failing to override -c/--cc
  t4013: support test_expect_failure through ':failure' magic
  diff-merges: revise revs->diff flag handling
  diff-merges: handle imply -p on -c/--cc logic for log.c
  ...
2021-02-05 16:40:44 -08:00
eb9071912f commit-graph: anonymize data in chunk_write_fn
In preparation for creating an API around file formats using chunks and
tables of contents, prepare the commit-graph write code to use
prototypes that will match this new API.

Specifically, convert chunk_write_fn to take a "void *data" parameter
instead of the commit-graph-specific "struct write_commit_graph_context"
pointer.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-05 15:40:41 -08:00
4f37d45706 clone: respect remote unborn HEAD
Teach Git to use the "unborn" feature introduced in a previous patch as
follows: Git will always send the "unborn" argument if it is supported
by the server. During "git clone", if cloning an empty repository, Git
will use the new information to determine the local branch to create. In
all other cases, Git will ignore it.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-05 13:49:55 -08:00
39835409d1 connect, transport: encapsulate arg in struct
In a future patch we plan to return the name of an unborn current branch
from deep in the callchain to a caller via a new pointer parameter that
points at a variable in the caller when the caller calls
get_remote_refs() and transport_get_remote_refs().

In preparation for that, encapsulate the existing ref_prefixes
parameter into a struct. The aforementioned unborn current branch will
go into this new struct in the future patch.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-05 13:49:54 -08:00
59e1205d16 ls-refs: report unborn targets of symrefs
When cloning, we choose the default branch based on the remote HEAD.
But if there is no remote HEAD reported (which could happen if the
target of the remote HEAD is unborn), we'll fall back to using our local
init.defaultBranch. Traditionally this hasn't been a big deal, because
most repos used "master" as the default. But these days it is likely to
cause confusion if the server and client implementations choose
different values (e.g., if the remote started with "main", we may choose
"master" locally, create commits there, and then the user is surprised
when they push to "master" and not "main").

To solve this, the remote needs to communicate the target of the HEAD
symref, even if it is unborn, and "git clone" needs to use this
information.

Currently, symrefs that have unborn targets (such as in this case) are
not communicated by the protocol. Teach Git to advertise and support the
"unborn" feature in "ls-refs" (by default, this is advertised, but
server administrators may turn this off through the lsrefs.unborn
config). This feature indicates that "ls-refs" supports the "unborn"
argument; when it is specified, "ls-refs" will send the HEAD symref with
the name of its unborn target.

This change is only for protocol v2. A similar change for protocol v0
would require independent protocol design (there being no analogous
position to signal support for "unborn") and client-side plumbing of the
data required, so the scope of this patch set is limited to protocol v2.

The client side will be updated to use this in a subsequent commit.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-05 13:49:53 -08:00
6eda9ac9e5 doc: use https links
Use only https links for lore.kernel.org.

Signed-off-by: Thomas Ackermann <th.acker@arcor.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-05 11:57:10 -08:00
1d18997007 doc hash-function-transition: move rationale upwards
Move rationale for new hash function to beginning of document
so that it appears before the concrete move to SHA-256 is described.

Remove some of the details about SHA-1 weaknesses and add references
to the details on how the new hash function was chosen instead.

Signed-off-by: Thomas Ackermann <th.acker@arcor.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-05 11:57:10 -08:00
cc9f0916bd doc hash-function-transition: fix incomplete sentence
Signed-off-by: Thomas Ackermann <th.acker@arcor.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-05 11:57:02 -08:00
810372f881 doc hash-function-transition: use upper case consistently
Use upper case consistently in Document History.

Signed-off-by: Thomas Ackermann <th.acker@arcor.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-05 11:57:02 -08:00
af9b1e9aba doc hash-function-transition: use SHA-1 and SHA-256 consistently
Use SHA-1 and SHA-256 instead of sha1 and sha256  when referring
to the hash type.

Signed-off-by: Thomas Ackermann <th.acker@arcor.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-05 11:57:02 -08:00
de82095a95 doc hash-function-transition: fix asciidoc output
Asciidoc requires lists to start with an empty line and uses
different characters for indentation levels ("-", "*", "**", ...).
For special symbols like a dash "--" has to be used and there is
no double arrow "<->", so a left and right arrow "<-->" has to be
combined for that. Lastly for verbatim output a newline followed
by an indentation has to be used.

Fix asciidoc output for lists, special characters and verbatim
text while retaining the readabilty of the original text file.

Signed-off-by: Thomas Ackermann <th.acker@arcor.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-05 11:57:02 -08:00
5189bb8724 range-diff: simplify code spawning git log
Previously, we waited for the child process to be finished in every
failing code path as well as at the end of the function
`show_range_diff()`.

However, we do not need to wait that long. Directly after reading the
output of the child process, we can wrap up the child process.

This also has the advantage that we don't do a bunch of unnecessary work
in case `finish_command()` returns with an error anyway.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-04 17:16:42 -08:00
a2d474adf3 range-diff: libify the read_patches() function again
In library functions, we do want to avoid the (simple, but rather final)
`die()` calls, instead returning with a value indicating an error.

Let's do exactly that in the code introduced in b66885a30c
(range-diff: add section header instead of diff header, 2019-07-11) that
wants to error out if a diff header could not be parsed.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-04 17:16:42 -08:00
8c29b49794 range-diff: avoid leaking memory in two error code paths
In the code paths in question, we already release a lot of memory, but
the `current_filename` variable was missed. Fix that.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-04 17:16:42 -08:00
30b29f044a The fifth batch 2021-02-03 15:04:49 -08:00
22f2bce651 Merge branch 'jk/run-command-use-shell-doc'
The .use_shell flag in struct child_process that is passed to
run_command() API has been clarified with a bit more documentation.

* jk/run-command-use-shell-doc:
  run-command: document use_shell option
2021-02-03 15:04:49 -08:00
973e20b83f Merge branch 'jk/peel-iterated-oid'
The peel_ref() API has been replaced with peel_iterated_oid().

* jk/peel-iterated-oid:
  refs: switch peel_ref() to peel_iterated_oid()
2021-02-03 15:04:49 -08:00
6cd7f9dc29 Merge branch 'js/skip-dashed-built-ins-from-config-mak'
Build fix.

* js/skip-dashed-built-ins-from-config-mak:
  SKIP_DASHED_BUILT_INS: respect `config.mak`
2021-02-03 15:04:49 -08:00
d03553ecd1 Merge branch 'jt/packfile-as-uri-doc'
Doc fix for packfile URI feature.

* jt/packfile-as-uri-doc:
  Doc: clarify contents of packfile sent as URI
2021-02-03 15:04:49 -08:00
15bf48b987 Merge branch 'ds/maintenance-prefetch-cleanup'
Test clean-up plus UI improvement by hiding extra refs that
the prefetch task uses from "log --decorate" output.

* ds/maintenance-prefetch-cleanup:
  t7900: clean up some broken refs
  maintenance: set log.excludeDecoration durin prefetch
2021-02-03 15:04:48 -08:00
18e3f5a944 Merge branch 'ab/fsck-doc-fix'
Documentation for "git fsck" lost stale bits that has become
incorrect.

* ab/fsck-doc-fix:
  fsck doc: remove ancient out-of-date diagnostics
2021-02-03 15:04:48 -08:00
97b8294474 bisect--helper: retire --check-and-set-terms subcommand
The `--check-and-set-terms` subcommand is no longer from the
git-bisect.sh shell script. Instead the function
`check_and_set_terms()` is called from the C implementation.

Mentored-by: Lars Schneider <larsxschneider@gmail.com>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-03 14:52:09 -08:00
e4c7b33747 bisect--helper: reimplement bisect_skip shell function in C
Reimplement the `bisect_skip()` shell function in C and also add
`bisect-skip` subcommand to `git bisect--helper` to call it from
git-bisect.sh

Using `--bisect-skip` subcommand is a temporary measure to port shell
function to C so as to use the existing test suite.

Mentored-by: Lars Schneider <larsxschneider@gmail.com>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-03 14:52:09 -08:00
9feea34810 bisect--helper: retire --bisect-auto-next subcommand
The --bisect-auto-next subcommand is no longer used from the
git-bisect.sh shell script. Instead the function bisect_auto_next()
is directly called from the C implementation.

Mentored-by: Lars Schneider <larsxschneider@gmail.com>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-03 14:52:09 -08:00
b7a6f163d6 bisect--helper: use res instead of return in BISECT_RESET case option
Use `res` variable to store `bisect_reset()` output in BISECT_RESET
case option to make bisect--helper.c more consistent.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-03 14:52:09 -08:00
68efed8c8a bisect--helper: retire --bisect-write subcommand
The `--bisect-write` subcommand is no longer used from the
git-bisect.sh shell script. Instead the function `bisect_write()`
is directly called from the C implementation.

Mentored-by: Lars Schneider <larsxschneider@gmail.com>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-03 14:52:08 -08:00
2b1fd947f6 bisect--helper: reimplement bisect_replay shell function in C
Reimplement the `bisect_replay` shell function in C and also add
`--bisect-replay` subcommand to `git bisect--helper` to call it from
git-bisect.sh

Using `--bisect-replay` subcommand is a temporary measure to port shell
function to C so as to use the existing test suite.

Mentored-by: Lars Schneider <larsxschneider@gmail.com>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-03 14:52:08 -08:00
97d5ba6a39 bisect--helper: reimplement bisect_log shell function in C
Reimplement the `bisect_log()` shell function in C and also add
`--bisect-log` subcommand to `git bisect--helper` to call it from
git-bisect.sh .

Using `--bisect-log` subcommand is a temporary measure to port shell
function to C so as to use the existing test suite.

Mentored-by: Lars Schneider <larsxschneider@gmail.com>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Helped-by: Rafael Silva <rafaeloliveira.cs@gmail.com>
Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-03 14:52:08 -08:00
27dc071b9a doc/git-branch: fix awkward wording for "-c"
The description for "-c" is hard to parse. I think the big issue is lack
of commas, but I've also reordered the words to keep the main focus
point of "instead of renaming, copy" together.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-03 14:14:31 -08:00
bca362c1f9 completion: handle other variants of "branch -m"
We didn't special-case "branch -M" (with a capital M) the same as
"branch -m", nor any of the "--copy" variants. As a result these offered
any ref as the next candidate, and not just branch names.

Note that I rewrapped case-arm line since it's now quite long, and
likewise the one below it for consistency. I also re-ordered the
existing "-D" to make it more obvious how the cases group together.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-03 14:14:24 -08:00
5c327502db MacOS: precompose_argv_prefix()
The following sequence leads to a "BUG" assertion running under MacOS:

  DIR=git-test-restore-p
  Adiarnfd=$(printf 'A\314\210')
  DIRNAME=xx${Adiarnfd}yy
  mkdir $DIR &&
  cd $DIR &&
  git init &&
  mkdir $DIRNAME &&
  cd $DIRNAME &&
  echo "Initial" >file &&
  git add file &&
  echo "One more line" >>file &&
  echo y | git restore -p .

 Initialized empty Git repository in /tmp/git-test-restore-p/.git/
 BUG: pathspec.c:495: error initializing pathspec_item
 Cannot close git diff-index --cached --numstat
 [snip]

The command `git restore` is run from a directory inside a Git repo.
Git needs to split the $CWD into 2 parts:
The path to the repo and "the rest", if any.
"The rest" becomes a "prefix" later used inside the pathspec code.

As an example, "/path/to/repo/dir-inside-repå" would determine
"/path/to/repo" as the root of the repo, the place where the
configuration file .git/config is found.

The rest becomes the prefix ("dir-inside-repå"), from where the
pathspec machinery expands the ".", more about this later.
If there is a decomposed form, (making the decomposing visible like this),
"dir-inside-rep°a" doesn't match "dir-inside-repå".

Git commands need to:

 (a) read the configuration variable "core.precomposeunicode"
 (b) precocompose argv[]
 (c) precompose the prefix, if there was any

The first commit,
76759c7dff "git on Mac OS and precomposed unicode"
addressed (a) and (b).

The call to precompose_argv() was added into parse-options.c,
because that seemed to be a good place when the patch was written.

Commands that don't use parse-options need to do (a) and (b) themselfs.

The commands `diff-files`, `diff-index`, `diff-tree` and `diff`
learned (a) and (b) in
commit 90a78b83e0 "diff: run arguments through precompose_argv"

Branch names (or refs in general) using decomposed code points
resulting in decomposed file names had been fixed in
commit 8e712ef6fc "Honor core.precomposeUnicode in more places"

The bug report from above shows 2 things:
- more commands need to handle precomposed unicode
- (c) should be implemented for all commands using pathspecs

Solution:
precompose_argv() now handles the prefix (if needed), and is renamed into
precompose_argv_prefix().

Inside this function the config variable core.precomposeunicode is read
into the global variable precomposed_unicode, as before.
This reading is skipped if precomposed_unicode had been read before.

The original patch for preocomposed unicode, 76759c7dff, placed
precompose_argv() into parse-options.c

Now add it into git.c::run_builtin() as well.  Existing precompose
calls in diff-files.c and others may become redundant, and if we
audit the callflows that reach these places to make sure that they
can never be reached without going through the new call added to
run_builtin(), we might be able to remove these existing ones.

But in this commit, we do not bother to do so and leave these
precompose callsites as they are.  Because precompose() is
idempotent and can be called on an already precomposed string
safely, this is safer than removing existing calls without fully
vetting the callflows.

There is certainly room for cleanups - this change intends to be a bug fix.
Cleanups needs more tests in e.g. t/t3910-mac-os-precompose.sh, and should
be done in future commits.

[1] git-bugreport-2021-01-06-1209.txt (git can't deal with special characters)
[2] https://lore.kernel.org/git/A102844A-9501-4A86-854D-E3B387D378AA@icloud.com/

Reported-by: Daniel Troger <random_n0body@icloud.com>
Helped-By: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-03 14:09:37 -08:00
a534cf4f4d completion: treat "branch -D" the same way as "branch -d"
The former offers not just branches but tags as completion
candidates.

Mimic how "branch -d" limits its suggestion to branch names.

Reported-by: Paul Jolly <paul@myitcv.io>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-02 13:26:10 -08:00
ad6b5fefbd t5544: clarify 'hook works with partial clone' test
Apply a few leftover improvements from the review of ad5df6b782
(upload-pack.c: fix filter spec quoting bug).

1. Instead of enumerating objects reachable from HEAD, enumerate all
reachable objects, because HEAD has not special significance in this
test.

2. Instead of relying on the knowledge that "? in rev-list output
means partial clone", explicitly verify that there are no blobs with
cat-file.

Signed-off-by: Jacob Vosmaer <jacob@gitlab.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-02 12:21:38 -08:00
7da7ef6d7a Merge branch 'mk/russian-translation'
Fix typo in Russian translation.

* mk/russian-translation:
  git-gui: fix typo in russian locale
2021-02-02 23:51:30 +05:30
413e96f41e git-gui: fix typo in russian locale
Fixed typo in russian locale: издекса -> индекса

Signed-off-by: Mikhail Klyushin <klyushinmisha@gmail.com>
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
2021-02-02 23:50:31 +05:30
be8fc53e36 pager: properly log pager exit code when signalled
When git invokes a pager that exits with non-zero the common case is
that we'll already return the correct SIGPIPE failure from git itself,
but the exit code logged in trace2 has always been incorrectly
reported[1]. Fix that and log the correct exit code in the logs.

Since this gives us something to test outside of our recently-added
tests needing a !MINGW prerequisite, let's refactor the test to run on
MINGW and actually check for SIGPIPE outside of MINGW.

The wait_or_whine() is only called with a true "in_signal" from from
finish_command_in_signal(), which in turn is only used in pager.c.

The "in_signal && !WIFEXITED(status)" case is not covered by
tests. Let's log the default -1 in that case for good measure.

1. The incorrect logging of the exit code in was seemingly copy/pasted
   into finish_command_in_signal() in ee4512ed48 (trace2: create new
   combined trace facility, 2019-02-22)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-01 21:15:58 -08:00
85db79a96e run-command: add braces for "if" block in wait_or_whine()
Add braces to an "if" block in the wait_or_whine() function. This
isn't needed now, but will make a subsequent commit easier to read.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-01 21:15:58 -08:00
c24b7f6736 pager: test for exit code with and without SIGPIPE
Add tests for how git behaves when the pager itself exits with
non-zero, as well as for us exiting with 141 when we're killed with
SIGPIPE due to the pager not consuming its output.

There is some recent discussion[1] about these semantics, but aside
from what we want to do in the future, we should have a test for the
current behavior.

This test construct is stolen from 7559a1be8a (unblock and unignore
SIGPIPE, 2014-09-18). The reason not to make the test itself depend on
the MINGW prerequisite is to make a subsequent commit easier to read.

1. https://lore.kernel.org/git/87o8h4omqa.fsf@evledraar.gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-01 21:15:58 -08:00
61ff12fa50 pager: refactor wait_for_pager() function
Refactor the wait_for_pager() function. Since 507d7804c0 (pager:
don't use unsafe functions in signal handlers, 2015-09-04) the
wait_for_pager() and wait_for_pager_atexit() callers diverged on more
than they shared.

Let's extract the common code into a new close_pager_fds() helper, and
move the parts unique to the only to callers to those functions.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-01 21:15:58 -08:00
bc50d6c91f commit-graph: prepare commit graph
Before checking if the repository has a commit-graph loaded, be sure
to run prepare_commit_graph(). This is necessary because otherwise
the topo_levels slab is not initialized. As we compute topo_levels for
the new commits, we iterate further into the lower layers since the
first visit to each commit looks as though the topo_level is not
populated.

By properly initializing the topo_slab, we fix the previously broken
case of a split commit graph where a base layer has the
generation_data_overflow chunk.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-01 21:03:36 -08:00
fde55b0906 commit-graph: be extra careful about mixed generations
When upgrading to a commit-graph with corrected commit dates from
one without, there are a few things that need to be considered.

When computing generation numbers for the new commit-graph file that
expects to add the generation_data chunk with corrected commit
dates, we need to ensure that the 'generation' member of the
commit_graph_data struct is set to zero for these commits.

Unfortunately, the fallback to use topological level for generation
number when corrected commit dates are not available are causing us
harm here: parsing commits notices that read_generation_data is
false and populates 'generation' with the topological level.

The solution is to iterate through the commits, parse the commits
to populate initial values, then reset the generation values to
zero to trigger recalculation. This loop only occurs when the
existing commit-graph data has no corrected commit dates.

While this improves our situation somewhat, we have not completely
solved the issue for correctly computing generation numbers for mixed
layers. That follows in the next change.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-01 21:03:36 -08:00
9c2c0a8256 commit-graph: compute generations separately
The compute_generation_numbers() method was introduced by 3258c663
(commit-graph: compute generation numbers, 2018-05-01) to compute what
is now known as "topological levels". These are still stored in the
commit-graph file for compatibility sake while c1a09119 (commit-graph:
implement corrected commit date, 2021-01-16) updated the method to also
compute the new version of generation numbers: corrected commit date.

It makes sense why these are grouped. They perform very similar walks of
the necessary commits and compute similar maximums over each parent.
However, having these two together conflates them in subtle ways that is
hard to separate.

In particular, the topo_level slab is used to store the topological
levels in all cases, but the commit_graph_data_at(c)->generation member
stores different values depending on the state of the existing
commit-graph file.

* If the existing commit-graph file has a "GDAT" chunk, then these
  values represent corrected commit dates.

* If the existing commit-graph file doesn't have a "GDAT" chunk, then
  these values are actually the topological levels.

This issue only occurs only when upgrading an existing commit-graph file
into one that has the "GDAT" chunk. The current change does not resolve
this upgrade problem, but splitting the implementation into two pieces
here helps with that process, which will follow in the next change.

The important thing this helps with is the case where the
num_generation_data_overflows was being incremented incorrectly,
triggering a write of the overflow chunk.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-01 21:03:36 -08:00
448a39e65d commit-graph: validate layers for generation data
We need to be extra careful that we don't use corrected
commit dates from any layer of a commit-graph chain if there is a
single commit-graph file that is missing the generation_data chunk.
Update validate_mixed_generation_chain() to correctly update each
layer to ignore the generation_data chunk in this case. It now also
returns 1 if all layers have a generation_data chunk. This return
value will be used in the next change.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-01 21:03:36 -08:00
90cb1c47c7 commit-graph: always parse before commit_graph_data_at()
There is a subtle failure happening when computing corrected commit
dates with --split enabled. It requires a base layer needing the
generation_data_overflow chunk. Then, the next layer on top
erroneously thinks it needs an overflow chunk due to a bug leading
to recalculating all reachable generation numbers. The output of
the failure is

  BUG: commit-graph.c:1912: expected to write 8 bytes to
  chunk 47444f56, but wrote 0 instead

These "expected" 8 bytes are due to re-computing the corrected
commit date for the lower layer but the new layer does not need
any overflow.

Add a test to t5318-commit-graph.sh that demonstrates this bug. However,
it does not trigger consistently with the existing code.

The generation number data is stored in a slab and accessed by
commit_graph_data_at(). This data is initialized when parsing a commit,
but is otherwise used assuming it has been populated. The loop in
compute_generation_numbers() did not enforce that all reachable
commits were parsed and had correct values. This could lead to some
problems when writing a commit-graph with corrected commit dates based
on a commit-graph without them.

It has been difficult to identify the issue here because it was so hard
to reproduce. It relies on this uninitialized data having a non-zero
value, but also on specifically in a way that overwrites the existing
data.

This patch adds the extra parse to ensure the data is filled before we
compute the generation number of a commit. This triggers the new test
to fail because the generation number overflow count does not match
between this computation and the write for that chunk.

The actual fix will follow as the next few changes.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-01 21:03:36 -08:00
c4cc083169 commit-graph: use repo_parse_commit
The write_commit_graph_context has a repository pointer, so use it.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-01 21:03:35 -08:00
0fac156523 commit-reach: reduce requirements for remove_redundant()
Remove a comment at the beggining of remove_redundant() that mentions a
reordering of the input array to have the initial segment be the
independent commits and the final segment be the redundant commits.
While this behavior is followed in remove_redundant(), no callers rely
on that behavior.

Remove the final loop that copies this final segment and update the
comment to match the new behavior.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-01 11:50:33 -08:00
076b444a62 worktree: teach list verbose mode
"git worktree list" annotates each worktree according to its state such
as "prunable" or "locked", however it is not immediately obvious why
these worktrees are being annotated. For prunable worktrees a reason
is available that is returned by should_prune_worktree() and for locked
worktrees a reason might be available provided by the user via `lock`
command.

Let's teach "git worktree list" a --verbose mode that outputs the reason
why the worktrees are being annotated. The reason is a text that can take
virtually any size and appending the text on the default columned format
will make it difficult to extend the command with other annotations and
not fit nicely on the screen. In order to address this shortcoming the
annotation is then moved to the next line indented followed by the reason
If the reason is not available the annotation stays on the same line as
the worktree itself.

The output of "git worktree list" with verbose becomes like so:

    $ git worktree list --verbose
    ...
    /path/to/locked-no-reason    acb124 [branch-a] locked
    /path/to/locked-with-reason  acc125 [branch-b]
        locked: worktree with a locked reason
    /path/to/prunable-reason     ace127 [branch-d]
        prunable: gitdir file points to non-existent location
    ...

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Rafael Silva <rafaeloliveira.cs@gmail.com>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-30 09:57:40 -08:00
9b19a58f66 worktree: teach list to annotate prunable worktree
The "git worktree list" command shows the absolute path to the worktree,
the commit that is checked out, the name of the branch, and a "locked"
annotation if the worktree is locked, however, it does not indicate
whether the worktree is prunable.

The "prune" command will remove a worktree if it is prunable unless
`--dry-run` option is specified. This could lead to a worktree being
removed without the user realizing before it is too late, in case the
user forgets to pass --dry-run for instance. If the "list" command shows
which worktree is prunable, the user could verify before running
"git worktree prune" and hopefully prevents the working tree to be
removed "accidentally" on the worse case scenario.

Let's teach "git worktree list" to show when a worktree is a prunable
candidate for both default and porcelain format.

In the default format a "prunable" text is appended:

    $ git worktree list
    /path/to/main      aba123 [main]
    /path/to/linked    123abc [branch-a]
    /path/to/prunable  ace127 (detached HEAD) prunable

In the --porcelain format a prunable label is added followed by
its reason:

    $ git worktree list --porcelain
    ...
    worktree /path/to/prunable
    HEAD abc1234abc1234abc1234abc1234abc1234abc12
    detached
    prunable gitdir file points to non-existent location
    ...

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Rafael Silva <rafaeloliveira.cs@gmail.com>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-30 09:57:35 -08:00
862c723d18 worktree: teach list --porcelain to annotate locked worktree
Commit c57b3367be (worktree: teach `list` to annotate locked worktree,
2020-10-11) taught "git worktree list" to annotate locked worktrees by
appending "locked" text to its output, however, this is not listed in
the --porcelain format.

Teach "list --porcelain" to do the same and add a "locked" attribute
followed by its reason, thus making both default and porcelain format
consistent. If the locked reason is not available then only "locked"
is shown.

The output of the "git worktree list --porcelain" becomes like so:

    $ git worktree list --porcelain
    ...
    worktree /path/to/locked
    HEAD 123abcdea123abcd123acbd123acbda123abcd12
    detached
    locked

    worktree /path/to/locked-with-reason
    HEAD abc123abc123abc123abc123abc123abc123abc1
    detached
    locked reason why it is locked
    ...

In porcelain mode, if the lock reason contains special characters
such as newlines, they are escaped with backslashes and the entire
reason is enclosed in double quotes. For example:

   $ git worktree list --porcelain
   ...
   locked "worktree's path mounted in\nremovable device"
   ...

Furthermore, let's update the documentation to state that some
attributes in the porcelain format might be listed alone or together
with its value depending whether the value is available or not. Thus
documenting the case of the new "locked" attribute.

Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Rafael Silva <rafaeloliveira.cs@gmail.com>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-30 09:57:29 -08:00
47409e75f5 t2402: ensure locked worktree is properly cleaned up
c57b3367be (worktree: teach `list` to annotate locked worktree,
2020-10-11) introduced a new test to ensure locked worktrees are listed
with "locked" annotation. However, the test does not clean up after
itself as "git worktree prune" is not going to remove the locked worktree
in the first place. This not only leaves the test in an unclean state it
also potentially breaks following tests that rely on the
"git worktree list" output.

Let's fix that by unlocking the worktree before the "prune" command.

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Rafael Silva <rafaeloliveira.cs@gmail.com>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-30 09:57:24 -08:00
eb36135af7 worktree: teach worktree_lock_reason() to gently handle main worktree
worktree_lock_reason() aborts with an assertion failure when called on
the main worktree since locking the main worktree is nonsensical. Not
only is this behavior undocumented, thus callers might not even be aware
that the call could potentially crash the program, but it also forces
clients to be extra careful:

    if (!is_main_worktree(wt) && worktree_locked_reason(...))
        ...

Since we know that locking makes no sense in the context of the main
worktree, we can simply return false for the main worktree, thus making
client code less complex by eliminating the need for the callers to have
inside knowledge about the implementation:

    if (worktree_lock_reason(...))
        ...

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Rafael Silva <rafaeloliveira.cs@gmail.com>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-30 09:57:20 -08:00
fc0c7d5e9e worktree: teach worktree to lazy-load "prunable" reason
Add worktree_prune_reason() to allow a caller to discover whether a
worktree is prunable and the reason that it is, much like
worktree_lock_reason() indicates whether a worktree is locked and the
reason for the lock. As with worktree_lock_reason(), retrieve the
prunable reason lazily and cache it in the `worktree` structure.

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Rafael Silva <rafaeloliveira.cs@gmail.com>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-30 09:57:16 -08:00
a29a8b7574 worktree: libify should_prune_worktree()
As part of teaching "git worktree list" to annotate worktree that is a
candidate for pruning, let's move should_prune_worktree() from
builtin/worktree.c to worktree.c in order to make part of the worktree
public API.

should_prune_worktree() knows how to select the given worktree for
pruning based on an expiration date, however the expiration value is
stored in a static file-scope variable and it is not local to the
function. In order to move the function, teach should_prune_worktree()
to take the expiration date as an argument and document the new
parameter that is not immediately obvious.

Also, change the function comment to clearly state that the worktree's
path is returned in `wtpath` argument.

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Rafael Silva <rafaeloliveira.cs@gmail.com>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-30 09:57:08 -08:00
2c0aa2ce2e doc/git-rebase: add documentation for fixup [-C|-c] options
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Reviewed-by: Marc Branchaud <marcnarc@xiplink.com>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-29 15:21:56 -08:00
bae5b4aea5 rebase -i: teach --autosquash to work with amend!
If the commit subject starts with "amend!" then rearrange it like a
"fixup!" commit and replace `pick` command with `fixup -C` command,
which is used to fixup up the content if any and replaces the original
commit message with amend! commit's message.

Original-patch-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-29 15:21:56 -08:00
1d410cd8c2 t3437: test script for fixup [-C|-c] options in interactive rebase
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-29 15:21:56 -08:00
9e3cebd97c rebase -i: add fixup [-C | -c] command
Add options to `fixup` command to fixup both the commit contents and
message. `fixup -C` command is used to replace the original commit
message and `fixup -c`, additionally allows to edit the commit message.

Original-patch-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-29 15:21:56 -08:00
71ee81cd9e sequencer: use const variable for commit message comments
This makes it easier to use and reuse the comments.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-29 15:21:56 -08:00
ae70e34f23 sequencer: pass todo_item to do_pick_commit()
As an additional member of the structure todo_item will be required in
future commits pass the complete structure.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-29 15:21:56 -08:00
7cdb968254 rebase -i: comment out squash!/fixup! subjects from squash message
When squashing commit messages the squash!/fixup! subjects are not of
interest so comment them out to stop them becoming part of the final
message.

This change breaks a bunch of --autosquash tests which rely on the
"squash! <subject>" line appearing in the final commit message. This is
addressed by adding a second line to the commit message of the "squash!
..." commits and testing for that.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-29 15:21:56 -08:00
a093f0ba95 l10n: ru.po: update Russian translation
Kudos to Philipp Bartsch for whitespace fixes and his helper script[1].

[1]: https://git.grmr.de/phil/pocheck

Signed-off-by: Dimitriy Ryazantcev <dimitriy.ryazantcev@gmail.com>
2021-01-29 21:45:17 +02:00
6885cd7dc5 t5325: check both on-disk and in-memory reverse index
Right now, the test suite can be run with 'GIT_TEST_WRITE_REV_INDEX=1'
in the environment, which causes all operations which write a pack to
also write a .rev file.

To prepare for when that eventually becomes the default, we should
continue to test the in-memory reverse index, too, in order to avoid
losing existing coverage. Unfortunately, explicit existing coverage is
rather sparse, so only a basic test is added that compares the result of

    git rev-list --objects --no-object-names --all |
    git cat-file --batch-check='%(objectsize:disk) %(objectname)'

with and without an on-disk reverse index.

Suggested-by: Jeff King <peff@peff.net>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-28 22:51:51 -08:00
018b9deba5 pretty: lazy-load commit data when expanding user-format
When we expand a user-format, we try to avoid work that isn't necessary
for the output. For instance, we don't bother parsing the commit header
until we know we need the author, subject, etc.

But we do always load the commit object's contents from disk, even if
the format doesn't require it (e.g., just "%H"). Traditionally this
didn't matter much, because we'd have loaded it as part of the traversal
anyway, and we'd typically have those bytes attached to the commit
struct (or these days, cached in a commit-slab).

But when we have a commit-graph, we might easily get to the point of
pretty-printing a commit without ever having looked at the actual object
contents. We should push off that load (and reencoding) until we're
certain that it's needed.

I think the results of p4205 show the advantage pretty clearly (we serve
parent and tree oids out of the commit struct itself, so they benefit as
well):

  # using git.git as the test repo
  Test                          HEAD^             HEAD
  ----------------------------------------------------------------------
  4205.1: log with %H           0.40(0.39+0.01)   0.03(0.02+0.01) -92.5%
  4205.2: log with %h           0.45(0.44+0.01)   0.09(0.09+0.00) -80.0%
  4205.3: log with %T           0.40(0.39+0.00)   0.04(0.04+0.00) -90.0%
  4205.4: log with %t           0.46(0.46+0.00)   0.09(0.08+0.01) -80.4%
  4205.5: log with %P           0.39(0.39+0.00)   0.03(0.03+0.00) -92.3%
  4205.6: log with %p           0.46(0.46+0.00)   0.10(0.09+0.00) -78.3%
  4205.7: log with %h-%h-%h     0.52(0.51+0.01)   0.15(0.14+0.00) -71.2%
  4205.8: log with %an-%ae-%s   0.42(0.41+0.00)   0.42(0.41+0.01) +0.0%

  # using linux.git as the test repo
  Test                          HEAD^             HEAD
  ----------------------------------------------------------------------
  4205.1: log with %H           7.12(6.97+0.14)   0.76(0.65+0.11) -89.3%
  4205.2: log with %h           7.35(7.19+0.16)   1.30(1.19+0.11) -82.3%
  4205.3: log with %T           7.58(7.42+0.15)   1.02(0.94+0.08) -86.5%
  4205.4: log with %t           8.05(7.89+0.15)   1.55(1.41+0.13) -80.7%
  4205.5: log with %P           7.12(7.01+0.10)   0.76(0.69+0.07) -89.3%
  4205.6: log with %p           7.38(7.27+0.10)   1.32(1.20+0.12) -82.1%
  4205.7: log with %h-%h-%h     7.81(7.67+0.13)   1.79(1.67+0.12) -77.1%
  4205.8: log with %an-%ae-%s   7.90(7.74+0.15)   7.81(7.66+0.15) -1.1%

I added the final test to show where we don't improve (the 1% there is
just lucky noise), but also as a regression test to make sure we're not
doing anything stupid like loading the commit multiple times when there
are several placeholders that need it.

Reported-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-28 14:07:35 -08:00
f7d42ceec5 rebase -i: do leave commit message intact in fixup! chains
In 6e98de72c0 (sequencer (rebase -i): add support for the 'fixup' and
'squash' commands, 2017-01-02), this developer introduced a change of
behavior by mistake: when encountering a `fixup!` commit (or multiple
`fixup!` commits) without any `squash!` commit thrown in, the final `git
commit` was invoked with `--cleanup=strip`. Prior to that commit, the
commit command had been called without that `--cleanup` option.

Since we explicitly read the original commit message from a file in that
case, there is really no sense in forcing that clean-up.

We actually need to actively suppress that clean-up lest a configured
`commit.cleanup` may interfere with what we want to do: leave the commit
message unchanged.

Reported-by: Vojtěch Knyttl <vojtech@knyt.tl>
Helped-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-28 12:12:37 -08:00
30291525d9 t0000: consistently use single quotes for outer tests
When we use the sub-test helpers, we end up defining one shell snippet
inside another shell snippet. So if we use single-quotes for the outer
snippet, we have to use double-quotes within the inner snippet (it's
included as here-doc within the outer snippet, but using a single quote
would end the outer snippet early). Or vice versa we can use double
quotes for the outer snippet, but then single quotes in the inner.

We have some of each in the script, and neither is wrong. But it would
be nice to be consistent unless there is a good reason not to. Using
single quotes for the outer script is preferable, because it requires
less metacharacter quoting overall. For example, in:

  test_expect_success 'outer' '
	run_sub_test_lib_test ...  <<-\EOF
		echo $foo &&
		test_expect_success "inner" "
			echo \$bar
		"
	EOF
  '

we need only quote inside "inner", but not inside "outer" or the
here-doc. Whereas if we flip them, we have to quote in both places:

  test_expect_success 'outer' "
	run_sub_test_lib_test ...  <<-\EOF
		echo \$foo &&
		test_expect_success 'inner' '
			echo \$bar
		'
	EOF
  "

The exception is when we need a literal single-quote in an expected
output here-doc. There we can either use outer double-quotes, or just
use ${SQ} within the doc. I chose the latter for consistency (within
this test, but also with other test scripts that face the same problem).

There is one other interesting case, which is some tests that do:

  test_expect_success ... "
	do_something --run='"'!3'"'
  "

This is rather confusing to read, but is correct. The outer script sees
'!3' in single-quotes, as does the eval'd snippet. This is perhaps being
overly cautious. In many interactive shells, an exclamation triggers
history expansion even inside double quotes, but that is not generally
true in non-interactive shells.

There's some conflicting information here. Commit 784ce03d55 (t4216:
avoid unnecessary subshell in test_bloom_filters_not_used, 2020-05-19)
reports it as a problem with OpenBSD 6.7's /bin/sh. However, we have
many instances in this script of prereqs like !LAZY_TRUE, which haven't
been a problem. I left them un-escaped here to test out this theory.
It's much nicer if we can not worry about this as a portability issue,
so it's worth knowing.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-28 12:06:26 -08:00
080e295248 t0000: run cleaning test inside sub-test
Our check of test_when_finished is done directly in the main script, and
if we failed to clean, we complain and exit immediately. It's nicer to
signal a test failure here, for a few reasons:

  - this gives better output to the user when run under a TAP harness
    like "prove"

  - constency; it's the only test left in the file that behaves this way

  - half of its "if" conditional is nonsense anyway; it picked up a
    reference to GIT_TEST_FAIL_PREREQS_INTERNAL in dfe1a17df9 (tests:
    add a special setup where prerequisites fail, 2019-05-13) along with
    its neighbors, even though it has nothing to do with that flag

We could actually do this without a sub-test at all, and just put our
two tests (one to do cleanup, and one to check that it happened) in the
main script. But doing it in a subtest is conceptually cleaner (from the
perspective of the main test script, we are checking only one thing),
and it remains consistent with the "cleanup when failing" test directly
after it, which has to happen in a sub-test (to avoid the main script
complaining of the failed test).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-28 12:06:26 -08:00
efd2600e6f t0000: run prereq tests inside sub-test
We test the behavior of prerequisites in t0000 by setting up fake ones
in the main test script, trying to run some tests, and then seeing if
those tests impacted the environment correctly. If they didn't, then we
write a message and manually call exit.

Instead, let's push these down into a sub-test, like many of the other
tests covering the framework itself. This has a few advantages:

  - it does not pollute the test output with mention of skipped tests
    (that we know are uninteresting -- the point of the test was to see
    that these are skipped).

  - when running in a TAP harness, we get a useful test failure message
    (whereas when the script exits early, a tool like "prove" simply
    says "Dubious, test returned 1").

  - we do not have to worry about different test environments, such as
    when GIT_TEST_FAIL_PREREQS_INTERNAL is set. Our sub-test helpers
    already give us a known environment.

  - the tests themselves are a bit easier to read, as we can just check
    the test-framework output to see what happened (and get the usual
    test_cmp diff if it failed)

A few notes on the implementation:

  - we could do one sub-test per each individual test_expect_success. I
    broke it up here into a few logical groups, as I think this makes it
    more readable

  - the original tests modified environment variables inside the test
    bodies. Instead, I've used "true" as the body of a test we expect to
    run and "false" otherwise. Technically this does not confirm that
    the body of the "true" test actually ran. We are trusting the
    framework output to believe that it truly ran, which is sufficient
    for these tests. And I think the end result is much simpler to
    follow.

  - the nested_prereq test uses a few bare "test -f" calls; I converted
    these to our usual test_path_is_* helpers while moving the code
    around.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-28 12:06:26 -08:00
03efadb774 t0000: keep clean-up tests together
We check that test_when_finished cleans up after a test, and that it
runs even after a failure. Those two were originally adjacent, but got
split apart by the new test added in 477dcaddb6 (tests: do not let lazy
prereqs inside `test_expect_*` turn off tracing, 2020-03-26), and then
further by more lazy-prereq tests. Let's move them back together.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-28 12:06:25 -08:00
8380dcd700 oid_pos(): access table through const pointers
When we are looking up an oid in an array, we obviously don't need to
write to the array. Let's mark it as const in the function interfaces,
as well as in the local variables we use to derference the void pointer
(note a few cases use pointers-to-pointers, so we mark everything
const).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-28 12:03:26 -08:00
45ee13b942 hash_pos(): convert to oid_pos()
All of our callers are actually looking up an object_id, not a bare
hash. Likewise, the arrays they are looking in are actual arrays of
object_id (not just raw bytes of hashes, as we might find in a pack
.idx; those are handled by bsearch_hash()).

Using an object_id gives us more type safety, and makes the callers
slightly shorter. It also gets rid of the word "sha1" from several
access functions, though we could obviously also rename those with
s/sha1/hash/.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-28 12:02:39 -08:00
680ff910b0 rerere: use strmap to store rerere directories
We store a struct for each directory we access under .git/rr-cache. The
structs are kept in an array sorted by the binary hash associated with
their name (and we do lookups with a binary search).

This works OK, but there are a few small downsides:

 - the amount of code isn't huge, but it's more than we'd need using one
   of our other stock data structures

 - the insertion into a sorted array is quadratic (though in practice
   it's unlikely anybody has enough conflicts for this to matter)

 - it's intimately tied to the representation of an object hash. This
   isn't a big deal, as the conflict ids we generate use the same hash,
   but it produces a few awkward bits (e.g., we are the only user of
   hash_pos() that is not using object_id).

Let's instead just treat the directory names as strings, and store them
in a strmap. This is less code, and removes the use of hash_pos().

Insertion is now non-quadratic, though we probably use a bit more
memory. Besides the hash table overhead, and storing hex bytes instead
of a binary hash, we actually store each name twice. Other code expects
to access the name of a rerere_dir struct from the struct itself, so we
need a copy there. But strmap keeps its own copy of the name, as well.

Using a bare hashmap instead of strmap means we could use the name for
both, but at the cost of extra code (e.g., our own comparison function).
Likewise, strmap has a feature to use a pointer to the in-struct name at
the cost of a little extra code. I didn't do either here, as simple code
seemed more important than squeezing out a few bytes of efficiency.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-28 11:26:20 -08:00
098c173f2b rerere: tighten rr-cache dirname check
We check only that get_sha1_hex() doesn't complain, which means we'd
match an all-hex name with trailing cruft after it. This probably
doesn't matter much in practice, since there shouldn't be anything else
in the rr-cache directory, but it could possibly cause us to mix up sha1
and sha256 entries (which also shouldn't be intermingled, but could be
leftovers from a repository conversion).

Note that "get_sha1_hex()" is a confusing historical name. It is
actually using the_hash_algo, so it would be sha256 in a sha256 repo.
We'll switch to using parse_oid_hex(), because that conveniently
advances our pointer. But it also gets rid of the sha1 name. Arguably
it's a little funny to use "object_id" here for something that isn't
actually naming an object, but it's unlikely to be a problem (and is
contained in a single function).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-28 11:25:43 -08:00
2bc1a87e42 rerere: check dirname format while iterating rr_cache directory
In rerere_gc(), we walk over the .git/rr_cache directory and create a
struct for each entry we find. We feed any name we get from readdir() to
find_rerere_dir(), which then calls get_sha1_hex() on it (since we use
the binary hash as a lookup key). If that fails (i.e., the directory
name is not what we expected), it returns NULL. But the comment in
find_rerere_dir() says "BUG".

It _would_ be a bug for the call from new_rerere_id_hex(), the only
other code path, to fail here; it's generating the hex internally. But
the call in rerere_gc() is using it say "is this a plausible directory
name".

Let's instead have rerere_gc() do its own "is this plausible" check.
That has two benefits:

  - we can now reliably BUG() inside find_rerere_dir(), which would
    catch bugs in the other code path (and we now will never return NULL
    from the function, which makes it easier to see that a rerere_id
    struct will always have a non-NULL "collection" field).

  - it makes the use of the binary hash an implementation detail of
    find_rerere_dir(), not known by callers. That will free us up to
    change it in a future patch.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-28 11:21:27 -08:00
98c431b6f9 commit_graft_pos(): take an oid instead of a bare hash
All of our callers have an object_id, and are just dereferencing the
hash field to pass to us. Let's take the actual object_id instead. We
still access the hash to pass to hash_pos, but it's a step in the right
direction.

This makes the callers slightly simpler, but also gets rid of the
untyped pointer, as well as the now-inaccurate name "sha1".

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-28 11:21:07 -08:00
ad5df6b782 upload-pack.c: fix filter spec quoting bug
Fix a bug in upload-pack.c that occurs when you combine partial
clone and uploadpack.packObjectsHook. You can reproduce it as
follows:

    git clone -u 'git -c uploadpack.allowfilter '\
	'-c uploadpack.packobjectshook=env '\
	'upload-pack' --filter=blob:none --no-local \
	src.git dst.git

Be careful with the line endings because this has a long quoted
string as the -u argument.

The error I get when I run this is:

	Cloning into '/tmp/broken'...
	remote: fatal: invalid filter-spec ''blob:none''
	error: git upload-pack: git-pack-objects died with error.
	fatal: git upload-pack: aborting due to possible repository corruption on the remote side.
	remote: aborting due to possible repository corruption on the remote side.
	fatal: early EOF
	fatal: index-pack failed

The problem is caused by unneeded quoting.

This bug was already present in 10ac85c785 (upload-pack: add object
filtering for partial clone, 2017-12-08) when the server side filter
support was introduced.  In fact, in 10ac85c785 this was broken
regardless of uploadpack.packObjectsHook. Then in 0b6069fe0a
(fetch-pack: test support excluding large blobs, 2017-12-08) the
quoting was removed but only behind a conditional that depends on
whether uploadpack.packObjectsHook is set.

Because uploadpack.packObjectsHook is apparently rarely used, nobody
noticed the problematic quoting could still happen.

Remove the conditional quoting and add a test for partial clone in
t5544-pack-objects-hook.

Signed-off-by: Jacob Vosmaer <jacob@gitlab.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-28 09:40:24 -08:00
765dc16888 git-compat-util: always enable variadic macros
We allow variadic macros in the code base, but only if there is fallback
code for platforms that lack it. This leads to some annoyances:

  - the code is more complicated because of the fallbacks (e.g.,
    trace_printf(), etc, is implemented twice with a set of parallel
    wrappers).

  - some constructs are just impossible and we've had to live without
    them (e.g., a cross between FLEX_ALLOC and xstrfmt)

Since this feature is present in C99, we may be able to start counting
on it being available everywhere. Let's start with a weather balloon
patch to find out.

This patch makes the absolute minimal change by always setting
HAVE_VARIADIC_MACROS. If somebody runs into a platform where it's a
problem, they can undo it by commenting out the define. Likewise, if we
have to revert this, it would be quite unlikely to cause conflicts.

Once we feel comfortable that this is the right direction, then we can
start ripping out all the spots that actually look at the flag, and
removing the dead code.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-27 22:14:37 -08:00
679b5916cd range-diff/format-patch: refactor check for commit range
Currently, when called with exactly two arguments, `git range-diff`
tests for a literal `..` in each of the two. Likewise, the argument
provided via `--range-diff` to `git format-patch` is checked in the same
manner.

However, `<commit>^!` is a perfectly valid commit range, equivalent to
`<commit>^..<commit>` according to the `SPECIFYING RANGES` section of
gitrevisions[7].

In preparation for allowing more sophisticated ways to specify commit
ranges, let's refactor the check into its own function.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-27 22:01:49 -08:00
134768cf53 test-lib: prevent '--stress-jobs=X' from being ignored
'./t1234-foo.sh --stress-jobs=X ...' is supposed to run that test
script in X parallel jobs, but the number of jobs specified on the
command line is entirely ignored if other '--stress'-related options
follow.  I.e. both './t1234-foo.sh --stress-jobs=X --stress-limit=Y'
and './t1234-foo.sh --stress-jobs=X --stress' fall back to using twice
the number of CPUs parallel jobs instead.

The former has been broken since commit de69e6f6c9 (tests: let
--stress-limit=<N> imply --stress, 2019-03-03) [1], which started to
unconditionally overwrite the $stress variable holding the specified
number of jobs in its effort to imply '--stress'.  The latter has been
broken since f545737144 (tests: introduce --stress-jobs=<N>,
2019-03-03), because it didn't consider that handling '--stress' will
overwrite that variable as well.

We could fix this by being more careful about (over)writing that
$stress variable and checking first whether it has already been set.
But I think it's cleaner to use a dedicated variable to hold the
number of specified parallel jobs, so let's do that instead.

[1] In de69e6f6c9 there was no '--stress-jobs=X' option yet, the
    number of parallel jobs had to be specified via '--stress=X', so,
    strictly speaking, de69e6f6c9 broke './t1234-foo.sh --stress=X
    --stress-limit=Y'.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-26 17:58:33 -08:00
15c9649730 grep/log: remove hidden --debug and --grep-debug options
Remove the hidden "grep --debug" and "log --grep-debug" options added
in 17bf35a3c7 (grep: teach --debug option to dump the parse tree,
2012-09-13).

At the time these options seem to have been intended to go along with
a documentation discussion and to help the author of relevant tests to
perform ad-hoc debugging on them[1].

Reasons to want this gone:

 1. They were never documented, and the only (rather trivial) use of
    them in our own codebase for testing is something I removed back
    in e01b4dab01 (grep: change non-ASCII -i test to stop using
    --debug, 2017-05-20).

 2. Googling around doesn't show any in-the-wild uses I could dig up,
    and on the Git ML the only mentions after the original discussion
    seem to have been when they came up in unrelated diff contexts, or
    that test commit of mine.

 3. An exception to that is c581e4a749 (grep: under --debug, show
    whether PCRE JIT is enabled, 2019-08-18) where we added the
    ability to dump out when PCREv2 has the JIT in effect.

    The combination of that and my earlier b65abcafc7 (grep: use PCRE
    v2 for optimized fixed-string search, 2019-07-01) means Git prints
    this out in its most common in-the-wild configuration:

        $ git log  --grep-debug --grep=foo --grep=bar --grep=baz --all-match
        pcre2_jit_on=1
        pcre2_jit_on=1
        pcre2_jit_on=1
        [all-match]
        (or
         pattern_body<body>foo
         (or
          pattern_body<body>bar
          pattern_body<body>baz
         )
        )

        $ git grep --debug \( -e foo --and -e bar \) --or -e baz
        pcre2_jit_on=1
        pcre2_jit_on=1
        pcre2_jit_on=1
        (or
         (and
          patternfoo
          patternbar
         )
         patternbaz
        )

I.e. for each pattern we're considering for the and/or/--all-match
etc. debugging we'll now diligently spew out another identical line
saying whether the PCREv2 JIT is on or not.

I think that nobody's complained about that rather glaringly obviously
bad output says something about how much this is used, i.e. it's
not.

The need for this debugging aid for the composed grep/log patterns
seems to have passed, and the desire to dump the JIT config seems to
have been another one-off around the time we had JIT-related issues on
the PCREv2 codepath. That the original author of this debugging
facility seemingly hasn't noticed the bad output since then[2] is
probably some indicator.

1. https://lore.kernel.org/git/cover.1347615361.git.git@drmicha.warpmail.net/
2. https://lore.kernel.org/git/xmqqk1b8x0ac.fsf@gitster-ct.c.googlers.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-26 11:36:20 -08:00
ec8e7760ac pack-revindex: ensure that on-disk reverse indexes are given precedence
When an on-disk reverse index exists, there is no need to generate one
in memory. In fact, doing so can be slow, and require large amounts of
the heap.

Let's make sure that we treat the on-disk reverse index with precedence
(i.e., that when it exists, we don't bother trying to generate an
equivalent one in memory) by teaching Git how to conditionally die()
when generating a reverse index in memory.

Then, add a test to ensure that when (a) an on-disk reverse index
exists, and (b) when setting GIT_TEST_REV_INDEX_DIE_IN_MEMORY, that we
do not die, implying that we read from the on-disk one.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-25 18:32:44 -08:00
e8c58f894b t: support GIT_TEST_WRITE_REV_INDEX
Add a new option that unconditionally enables the pack.writeReverseIndex
setting in order to run the whole test suite in a mode that generates
on-disk reverse indexes. Additionally, enable this mode in the second
run of tests under linux-gcc in 'ci/run-build-and-tests.sh'.

Once on-disk reverse indexes are proven out over several releases, we
can change the default value of that configuration to 'true', and drop
this patch.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-25 18:32:44 -08:00
35a8a3547a t: prepare for GIT_TEST_WRITE_REV_INDEX
In the next patch, we'll add support for unconditionally enabling the
'pack.writeReverseIndex' setting with a new GIT_TEST_WRITE_REV_INDEX
environment variable.

This causes a little bit of fallout with tests that, for example,
compare the list of files in the pack directory being unprepared to see
.rev files in its output.

Those locations can be cleaned up to look for specific file extensions,
rather than take everything in the pack directory (for instance) and
then grep out unwanted items.

Once the pack.writeReverseIndex option has been thoroughly
tested, we will default it to 'true', removing GIT_TEST_WRITE_REV_INDEX,
and making it possible to revert this patch.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-25 18:32:44 -08:00
1615c567b8 Documentation/config/pack.txt: advertise 'pack.writeReverseIndex'
Now that the pack.writeReverseIndex configuration is respected in both
'git index-pack' and 'git pack-objects' (and therefore, all of their
callers), we can safely advertise it for use in the git-config manual.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-25 18:32:44 -08:00
c97733435a builtin/pack-objects.c: respect 'pack.writeReverseIndex'
Now that we have an implementation that can write the new reverse index
format, enable writing a .rev file in 'git pack-objects' by consulting
the pack.writeReverseIndex configuration variable.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-25 18:32:43 -08:00
e37d0b8730 builtin/index-pack.c: write reverse indexes
Teach 'git index-pack' to optionally write and verify reverse index with
'--[no-]rev-index', as well as respecting the 'pack.writeReverseIndex'
configuration option.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-25 18:32:43 -08:00
84d544943c builtin/index-pack.c: allow stripping arbitrary extensions
To derive the filename for a .idx file, 'git index-pack' uses
derive_filename() to strip the '.pack' suffix and add the new suffix.

Prepare for stripping off suffixes other than '.pack' by making the
suffix to strip a parameter of derive_filename(). In order to make this
consistent with the "suffix" parameter which does not begin with a ".",
an additional check in derive_filename.

Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-25 18:32:43 -08:00
8ef50d9958 pack-write.c: prepare to write 'pack-*.rev' files
This patch prepares for callers to be able to write reverse index files
to disk.

It adds the necessary machinery to write a format-compliant .rev file
from within 'write_rev_file()', which is called from
'finish_tmp_packfile()'.

Similar to the process by which the reverse index is computed in memory,
these new paths also have to sort a list of objects by their offsets
within a packfile. These new paths use a qsort() (as opposed to a radix
sort), since our specialized radix sort requires a full revindex_entry
struct per object, which is more memory than we need to allocate.

The qsort is obviously slower, but the theoretical slowdown would
require a repository with a large amount of objects, likely implying
that the time spent in, say, pack-objects during a repack would dominate
the overall runtime.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-25 18:32:43 -08:00
2f4ba2a867 packfile: prepare for the existence of '*.rev' files
Specify the format of the on-disk reverse index 'pack-*.rev' file, as
well as prepare the code for the existence of such files.

The reverse index maps from pack relative positions (i.e., an index into
the array of object which is sorted by their offsets within the
packfile) to their position within the 'pack-*.idx' file. Today, this is
done by building up a list of (off_t, uint32_t) tuples for each object
(the off_t corresponding to that object's offset, and the uint32_t
corresponding to its position in the index). To convert between pack and
index position quickly, this array of tuples is radix sorted based on
its offset.

This has two major drawbacks:

First, the in-memory cost scales linearly with the number of objects in
a pack.  Each 'struct revindex_entry' is sizeof(off_t) +
sizeof(uint32_t) + padding bytes for a total of 16.

To observe this, force Git to load the reverse index by, for e.g.,
running 'git cat-file --batch-check="%(objectsize:disk)"'. When asking
for a single object in a fresh clone of the kernel, Git needs to
allocate 120+ MB of memory in order to hold the reverse index in memory.

Second, the cost to sort also scales with the size of the pack.
Luckily, this is a linear function since 'load_pack_revindex()' uses a
radix sort, but this cost still must be paid once per pack per process.

As an example, it takes ~60x longer to print the _size_ of an object as
it does to print that entire object's _contents_:

  Benchmark #1: git.compile cat-file --batch <obj
    Time (mean ± σ):       3.4 ms ±   0.1 ms    [User: 3.3 ms, System: 2.1 ms]
    Range (min … max):     3.2 ms …   3.7 ms    726 runs

  Benchmark #2: git.compile cat-file --batch-check="%(objectsize:disk)" <obj
    Time (mean ± σ):     210.3 ms ±   8.9 ms    [User: 188.2 ms, System: 23.2 ms]
    Range (min … max):   193.7 ms … 224.4 ms    13 runs

Instead, avoid computing and sorting the revindex once per process by
writing it to a file when the pack itself is generated.

The format is relatively straightforward. It contains an array of
uint32_t's, the length of which is equal to the number of objects in the
pack.  The ith entry in this table contains the index position of the
ith object in the pack, where "ith object in the pack" is determined by
pack offset.

One thing that the on-disk format does _not_ contain is the full (up to)
eight-byte offset corresponding to each object. This is something that
the in-memory revindex contains (it stores an off_t in 'struct
revindex_entry' along with the same uint32_t that the on-disk format
has). Omit it in the on-disk format, since knowing the index position
for some object is sufficient to get a constant-time lookup in the
pack-*.idx file to ask for an object's offset within the pack.

This trades off between the on-disk size of the 'pack-*.rev' file for
runtime to chase down the offset for some object. Even though the lookup
is constant time, the constant is heavier, since it can potentially
involve two pointer walks in v2 indexes (one to access the 4-byte offset
table, and potentially a second to access the double wide offset table).

Consider trying to map an object's pack offset to a relative position
within that pack. In a cold-cache scenario, more page faults occur while
switching between binary searching through the reverse index and
searching through the *.idx file for an object's offset. Sure enough,
with a cold cache (writing '3' into '/proc/sys/vm/drop_caches' after
'sync'ing), printing out the entire object's contents is still
marginally faster than printing its size:

  Benchmark #1: git.compile cat-file --batch-check="%(objectsize:disk)" <obj >/dev/null
    Time (mean ± σ):      22.6 ms ±   0.5 ms    [User: 2.4 ms, System: 7.9 ms]
    Range (min … max):    21.4 ms …  23.5 ms    41 runs

  Benchmark #2: git.compile cat-file --batch <obj >/dev/null
    Time (mean ± σ):      17.2 ms ±   0.7 ms    [User: 2.8 ms, System: 5.5 ms]
    Range (min … max):    15.6 ms …  18.2 ms    45 runs

(Numbers taken in the kernel after cheating and using the next patch to
generate a reverse index). There are a couple of approaches to improve
cold cache performance not pursued here:

  - We could include the object offsets in the reverse index format.
    Predictably, this does result in fewer page faults, but it triples
    the size of the file, while simultaneously duplicating a ton of data
    already available in the .idx file. (This was the original way I
    implemented the format, and it did show
    `--batch-check='%(objectsize:disk)'` winning out against `--batch`.)

    On the other hand, this increase in size also results in a large
    block-cache footprint, which could potentially hurt other workloads.

  - We could store the mapping from pack to index position in more
    cache-friendly way, like constructing a binary search tree from the
    table and writing the values in breadth-first order. This would
    result in much better locality, but the price you pay is trading
    O(1) lookup in 'pack_pos_to_index()' for an O(log n) one (since you
    can no longer directly index the table).

So, neither of these approaches are taken here. (Thankfully, the format
is versioned, so we are free to pursue these in the future.) But, cold
cache performance likely isn't interesting outside of one-off cases like
asking for the size of an object directly. In real-world usage, Git is
often performing many operations in the revindex (i.e., asking about
many objects rather than a single one).

The trade-off is worth it, since we will avoid the vast majority of the
cost of generating the revindex that the extra pointer chase will look
like noise in the following patch's benchmarks.

This patch describes the format and prepares callers (like in
pack-revindex.c) to be able to read *.rev files once they exist. An
implementation of the writer will appear in the next patch, and callers
will gradually begin to start using the writer in the patches that
follow after that.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-25 18:32:43 -08:00
e6362826a0 The fourth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-25 14:19:20 -08:00
b7bb322cba Merge branch 'ab/mailmap-fixup'
Follow-up fixes and improvements to ab/mailmap topic.

* ab/mailmap-fixup:
  t4203: make blame output massaging more robust
  mailmap doc: use correct environment variable 'GIT_WORK_TREE'
  t4203: stop losing return codes of git commands
  test-lib-functions.sh: fix usage for test_commit()
2021-01-25 14:19:20 -08:00
bcaaf972e6 Merge branch 'tb/pack-revindex-api'
Abstract accesses to in-core revindex that allows enumerating
objects stored in a packfile in the order they appear in the pack,
in preparation for introducing an on-disk precomputed revindex.

* tb/pack-revindex-api: (21 commits)
  for_each_object_in_pack(): clarify pack vs index ordering
  pack-revindex.c: avoid direct revindex access in 'offset_to_pack_pos()'
  pack-revindex: hide the definition of 'revindex_entry'
  pack-revindex: remove unused 'find_revindex_position()'
  pack-revindex: remove unused 'find_pack_revindex()'
  builtin/gc.c: guess the size of the revindex
  for_each_object_in_pack(): convert to new revindex API
  unpack_entry(): convert to new revindex API
  packed_object_info(): convert to new revindex API
  retry_bad_packed_offset(): convert to new revindex API
  get_delta_base_oid(): convert to new revindex API
  rebuild_existing_bitmaps(): convert to new revindex API
  try_partial_reuse(): convert to new revindex API
  get_size_by_pos(): convert to new revindex API
  show_objects_for_type(): convert to new revindex API
  bitmap_position_packfile(): convert to new revindex API
  check_object(): convert to new revindex API
  write_reused_pack_verbatim(): convert to new revindex API
  write_reused_pack_one(): convert to new revindex API
  write_reuse_object(): convert to new revindex API
  ...
2021-01-25 14:19:20 -08:00
381dac2349 Merge branch 'ab/coc-update-to-2.0'
Update the Code-of-conduct to version 2.0 from the upstream (we've
been using version 1.4).

* ab/coc-update-to-2.0:
  CoC: update to version 2.0 + local changes
  CoC: explicitly take any whitespace breakage
  CoC: Update word-wrapping to match upstream
2021-01-25 14:19:19 -08:00
294e949fa2 Merge branch 'ps/config-env-pairs'
Introduce two new ways to feed configuration variable-value pairs
via environment variables, and tweak the way GIT_CONFIG_PARAMETERS
encodes variable/value pairs to make it more robust.

* ps/config-env-pairs:
  config: allow specifying config entries via envvar pairs
  environment: make `getenv_safe()` a public function
  config: store "git -c" variables using more robust format
  config: parse more robust format in GIT_CONFIG_PARAMETERS
  config: extract function to parse config pairs
  quote: make sq_dequote_step() a public function
  config: add new way to pass config via `--config-env`
  git: add `--super-prefix` to usage string
2021-01-25 14:19:19 -08:00
7eefa1349b Merge branch 'cc/write-promisor-file'
A bit of code refactoring.

* cc/write-promisor-file:
  pack-write: die on error in write_promisor_file()
  fetch-pack: refactor writing promisor file
  fetch-pack: rename helper to create_promisor_file()
2021-01-25 14:19:19 -08:00
8b48981987 Merge branch 'jx/bundle'
"git bundle" learns "--stdin" option to read its refs from the
standard input.  Also, it now does not lose refs whey they point
at the same object.

* jx/bundle:
  bundle: arguments can be read from stdin
  bundle: lost objects when removing duplicate pendings
  test: add helper functions for git-bundle
2021-01-25 14:19:19 -08:00
42342b3ee6 Merge branch 'ab/mailmap'
Clean-up docs, codepaths and tests around mailmap.

* ab/mailmap: (22 commits)
  shortlog: remove unused(?) "repo-abbrev" feature
  mailmap doc + tests: document and test for case-insensitivity
  mailmap tests: add tests for empty "<>" syntax
  mailmap tests: add tests for whitespace syntax
  mailmap tests: add a test for comment syntax
  mailmap doc + tests: add better examples & test them
  tests: refactor a few tests to use "test_commit --append"
  test-lib functions: add an --append option to test_commit
  test-lib functions: add --author support to test_commit
  test-lib functions: document arguments to test_commit
  test-lib functions: expand "test_commit" comment template
  mailmap: test for silent exiting on missing file/blob
  mailmap tests: get rid of overly complex blame fuzzing
  mailmap tests: add a test for "not a blob" error
  mailmap tests: remove redundant entry in test
  mailmap tests: improve --stdin tests
  mailmap tests: modernize syntax & test idioms
  mailmap tests: use our preferred whitespace syntax
  mailmap doc: start by mentioning the comment syntax
  check-mailmap doc: note config options
  ...
2021-01-25 14:19:19 -08:00
60ecad090d Merge branch 'ps/fetch-atomic'
"git fetch" learns to treat ref updates atomically in all-or-none
fashion, just like "git push" does, with the new "--atomic" option.

* ps/fetch-atomic:
  fetch: implement support for atomic reference updates
  fetch: allow passing a transaction to `s_update_ref()`
  fetch: refactor `s_update_ref` to use common exit path
  fetch: use strbuf to format FETCH_HEAD updates
  fetch: extract writing to FETCH_HEAD
2021-01-25 14:19:19 -08:00
b69bed22c5 Merge branch 'jk/log-cherry-pick-duplicate-patches'
When more than one commit with the same patch ID appears on one
side, "git log --cherry-pick A...B" did not exclude them all when a
commit with the same patch ID appears on the other side.  Now it
does.

* jk/log-cherry-pick-duplicate-patches:
  patch-ids: handle duplicate hashmap entries
2021-01-25 14:19:19 -08:00
27d7c8599b Merge branch 'js/default-branch-name-tests-final-stretch'
Prepare tests not to be affected by the name of the default branch
"git init" creates.

* js/default-branch-name-tests-final-stretch: (28 commits)
  tests: drop prereq `PREPARE_FOR_MAIN_BRANCH` where no longer needed
  t99*: adjust the references to the default branch name "main"
  tests(git-p4): transition to the default branch name `main`
  t9[5-7]*: adjust the references to the default branch name "main"
  t9[0-4]*: adjust the references to the default branch name "main"
  t8*: adjust the references to the default branch name "main"
  t7[5-9]*: adjust the references to the default branch name "main"
  t7[0-4]*: adjust the references to the default branch name "main"
  t6[4-9]*: adjust the references to the default branch name "main"
  t64*: preemptively adjust alignment to prepare for `master` -> `main`
  t6[0-3]*: adjust the references to the default branch name "main"
  t5[6-9]*: adjust the references to the default branch name "main"
  t55[4-9]*: adjust the references to the default branch name "main"
  t55[23]*: adjust the references to the default branch name "main"
  t551*: adjust the references to the default branch name "main"
  t550*: adjust the references to the default branch name "main"
  t5503: prepare aligned comment for replacing `master` with `main`
  t5[0-4]*: adjust the references to the default branch name "main"
  t5323: prepare centered comment for `master` -> `main`
  t4*: adjust the references to the default branch name "main"
  ...
2021-01-25 14:19:18 -08:00
440acfbe0c Merge branch 'dl/reflog-with-single-entry'
After expiring a reflog and making a single commit, the reflog for
the branch would record a single entry that knows both @{0} and
@{1}, but we failed to answer "what commit were we on?", i.e. @{1}

* dl/reflog-with-single-entry:
  refs: allow @{n} to work with n-sized reflog
  refs: factor out set_read_ref_cutoffs()
2021-01-25 14:19:18 -08:00
0806279428 Merge branch 'sj/untracked-files-in-submodule-directory-is-not-dirty'
"git diff" showed a submodule working tree with untracked cruft as
"Submodule commit <objectname>-dirty", but a natural expectation is
that the "-dirty" indicator would align with "git describe --dirty",
which does not consider having untracked files in the working tree
as source of dirtiness.  The inconsistency has been fixed.

* sj/untracked-files-in-submodule-directory-is-not-dirty:
  diff: do not show submodule with untracked files as "-dirty"
2021-01-25 14:19:18 -08:00
dfcd905069 Merge branch 'jc/deprecate-pack-redundant'
Warn loudly when the "pack-redundant" command, which has been left
stale with almost unusable performance issues, gets used, as we no
longer want to recommend its use (instead just "repack -d" instead).

* jc/deprecate-pack-redundant:
  pack-redundant: gauge the usage before proposing its removal
2021-01-25 14:19:18 -08:00
c7b1aaf6d6 Merge branch 'jk/forbid-lf-in-git-url'
Newline characters in the host and path part of git:// URL are
now forbidden.

* jk/forbid-lf-in-git-url:
  fsck: reject .gitmodules git:// urls with newlines
  git_connect_git(): forbid newlines in host and path
2021-01-25 14:19:17 -08:00
9e409d7e07 Merge branch 'ab/branch-sort'
The implementation of "git branch --sort" wrt the detached HEAD
display has always been hacky, which has been cleaned up.

* ab/branch-sort:
  branch: show "HEAD detached" first under reverse sort
  branch: sort detached HEAD based on a flag
  ref-filter: move ref_sorting flags to a bitfield
  ref-filter: move "cmp_fn" assignment into "else if" arm
  ref-filter: add braces to if/else if/else chain
  branch tests: add to --sort tests
  branch: change "--local" to "--list" in comment
2021-01-25 14:19:17 -08:00
a5ac31b5b1 Merge branch 'en/diffcore-rename'
File-level rename detection updates.

* en/diffcore-rename:
  diffcore-rename: remove unnecessary duplicate entry checks
  diffcore-rename: accelerate rename_dst setup
  diffcore-rename: simplify and accelerate register_rename_src()
  t4058: explore duplicate tree entry handling in a bit more detail
  t4058: add more tests and documentation for duplicate tree entry handling
  diffcore-rename: reduce jumpiness in progress counters
  diffcore-rename: simplify limit check
  diffcore-rename: avoid usage of global in too_many_rename_candidates()
  diffcore-rename: rename num_create to num_destinations
2021-01-25 14:19:17 -08:00
58e2ce9112 Merge branch 'ma/more-opaque-lock-file'
Code clean-up.

* ma/more-opaque-lock-file:
  read-cache: try not to peek into `struct {lock_,temp}file`
  refs/files-backend: don't peek into `struct lock_file`
  midx: don't peek into `struct lock_file`
  commit-graph: don't peek into `struct lock_file`
  builtin/gc: don't peek into `struct lock_file`
2021-01-25 14:19:17 -08:00
2856089e36 Merge branch 'en/merge-ort-3'
Rename detection is added to the "ORT" merge strategy.

* en/merge-ort-3:
  merge-ort: add implementation of type-changed rename handling
  merge-ort: add implementation of normal rename handling
  merge-ort: add implementation of rename collisions
  merge-ort: add implementation of rename/delete conflicts
  merge-ort: add implementation of both sides renaming differently
  merge-ort: add implementation of both sides renaming identically
  merge-ort: add basic outline for process_renames()
  merge-ort: implement compare_pairs() and collect_renames()
  merge-ort: implement detect_regular_renames()
  merge-ort: add initial outline for basic rename detection
  merge-ort: add basic data structures for handling renames
2021-01-25 14:19:17 -08:00
c7d6d419b0 Merge branch 'ab/mktag'
"git mktag" validates its input using its own rules before writing
a tag object---it has been updated to share the logic with "git
fsck".

* ab/mktag: (23 commits)
  mktag: add a --[no-]strict option
  mktag: mark strings for translation
  mktag: convert to parse-options
  mktag: allow omitting the header/body \n separator
  mktag: allow turning off fsck.extraHeaderEntry
  fsck: make fsck_config() re-usable
  mktag: use fsck instead of custom verify_tag()
  mktag: use puts(str) instead of printf("%s\n", str)
  mktag: remove redundant braces in one-line body "if"
  mktag: use default strbuf_read() hint
  mktag tests: test verify_object() with replaced objects
  mktag tests: improve verify_object() test coverage
  mktag tests: test "hash-object" compatibility
  mktag tests: stress test whitespace handling
  mktag tests: run "fsck" after creating "mytag"
  mktag tests: don't create "mytag" twice
  mktag tests: don't redirect stderr to a file needlessly
  mktag tests: remove needless SHA-1 hardcoding
  mktag tests: use "test_commit" helper
  mktag tests: don't needlessly use a subshell
  ...
2021-01-25 14:19:17 -08:00
95ca1f987e grep/pcre2: better support invalid UTF-8 haystacks
Improve the support for invalid UTF-8 haystacks given a non-ASCII
needle when using the PCREv2 backend.

This is a more complete fix for a bug I started to fix in
870eea8166 (grep: do not enter PCRE2_UTF mode on fixed matching,
2019-07-26), now that PCREv2 has the PCRE2_MATCH_INVALID_UTF mode we
can make use of it.

This fixes the sort of case described in 8a5999838e (grep: stess test
PCRE v2 on invalid UTF-8 data, 2019-07-26), i.e.:

    - The subject string is non-ASCII (e.g. "ævar")
    - We're under a is_utf8_locale(), e.g. "en_US.UTF-8", not "C"
    - We are using --ignore-case, or we're a non-fixed pattern

If those conditions were satisfied and we matched found non-valid
UTF-8 data PCREv2 might bark on it, in practice this only happened
under the JIT backend (turned on by default on most platforms).

Ultimately this fixes a "regression" in b65abcafc7 ("grep: use PCRE v2
for optimized fixed-string search", 2019-07-01), I'm putting that in
scare-quotes because before then we wouldn't properly support these
complex case-folding, locale etc. cases either, it just broke in
different ways.

There was a bug related to this the PCRE2_NO_START_OPTIMIZE flag fixed
in PCREv2 10.36. It can be worked around by setting the
PCRE2_NO_START_OPTIMIZE flag. Let's do that in those cases, and add
tests for the bug.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-24 16:09:17 -08:00
a4fea08b6e grep/pcre2 tests: don't rely on invalid UTF-8 data test
As noted in [1] when I originally added this test in [2] the test was
completely broken as it lacked a redirect[3]. I now think this whole
thing is overly fragile. Let's only test if we have a segfault here.

Before this the first test's "test_cmp" was pretty meaningless. We
were only testing if PCREv2 was so broken that it would spew out
something completely unrelated on stdout, which isn't very plausible.

In the second test we're relying on PCREv2 forever holding to the
current behavior of the PCRE_UTF8 flag, as opposed to learning some
optimistic graceful fallback to PCRE2_MATCH_INVALID_UTF in the
future. If that happens having this test broken under bisecting would
suck.

A follow-up commit will actually test this case in a meaningful way
under the PCRE2_MATCH_INVALID_UTF flag. Let's run this one
unconditionally, and just make sure we don't segfault.

1. e714b898c6 (t7812: expect failure for grep -i with invalid UTF-8
   data, 2019-11-29)
2. 8a5999838e (grep: stess test PCRE v2 on invalid UTF-8 data,
   2019-07-26)
3. c74b3cbb83 (t7812: add missing redirects, 2019-11-26)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-24 16:09:15 -08:00
557ac0350d merge-ort: begin performance work; instrument with trace2_region_* calls
Add some timing instrumentation for both merge-ort and diffcore-rename;
I used these to measure and optimize performance in both, and several
future patch series will build on these to reduce the timings of some
select testcases.

=== Setup ===

The primary testcase I used involved rebasing a random topic in the
linux kernel (consisting of 35 patches) against an older version.  I
added two variants, one where I rename a toplevel directory, and another
where I only rebase one patch instead of the whole topic.  The setup is
as follows:

  $ git clone git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
  $ git branch hwmon-updates fd8bdb23b91876ac1e624337bb88dc1dcc21d67e
  $ git branch hwmon-just-one fd8bdb23b91876ac1e624337bb88dc1dcc21d67e~34
  $ git branch base 4703d9119972bf586d2cca76ec6438f819ffa30e
  $ git switch -c 5.4-renames v5.4
  $ git mv drivers pilots  # Introduce over 26,000 renames
  $ git commit -m "Rename drivers/ to pilots/"
  $ git config merge.renameLimit 30000
  $ git config merge.directoryRenames true

=== Testcases ===

Now with REBASE standing for either "git rebase [--merge]" (using
merge-recursive) or "test-tool fast-rebase" (using merge-ort), the
testcases are:

Testcase #1: no-renames

  $ git checkout v5.4^0
  $ REBASE --onto HEAD base hwmon-updates

  Note: technically the name is misleading; there are some renames, but
  very few.  Rename detection only takes about half the overall time.

Testcase #2: mega-renames

  $ git checkout 5.4-renames^0
  $ REBASE --onto HEAD base hwmon-updates

Testcase #3: just-one-mega

  $ git checkout 5.4-renames^0
  $ REBASE --onto HEAD base hwmon-just-one

=== Timing results ===

Overall timings, using hyperfine (1 warmup run, 3 runs for mega-renames,
10 runs for the other two cases):

                       merge-recursive           merge-ort
    no-renames:       18.912 s ±  0.174 s    14.263 s ±  0.053 s
    mega-renames:   5964.031 s ± 10.459 s  5504.231 s ±  5.150 s
    just-one-mega:   149.583 s ±  0.751 s   158.534 s ±  0.498 s

A single re-run of each with some breakdowns:

                                    ---  no-renames  ---
                              merge-recursive   merge-ort
    overall runtime:              19.302 s        14.257 s
    inexact rename detection:      7.603 s         7.906 s
    everything else:              11.699 s         6.351 s

                                    --- mega-renames ---
                              merge-recursive   merge-ort
    overall runtime:            5950.195 s      5499.672 s
    inexact rename detection:   5746.309 s      5487.120 s
    everything else:             203.886 s        17.552 s

                                    --- just-one-mega ---
                              merge-recursive   merge-ort
    overall runtime:             151.001 s       158.582 s
    inexact rename detection:    143.448 s       157.835 s
    everything else:               7.553 s         0.747 s

=== Timing observations ===

0) Maximum speedup

The "everything else" row represents the maximum speedup we could
achieve if we were to somehow infinitely parallelize inexact rename
detection, but leave everything else alone.  The fact that this is so
much smaller than the real runtime (even in the case with virtually no
renames) makes it clear just how overwhelmingly large the time spent on
rename detection can be.

1) no-renames

1a) merge-ort is faster than merge-recursive, which is nice.  However,
this still should not be considered good enough.  Although the "merge"
backend to rebase (merge-recursive) is sometimes faster than the "apply"
backend, this is one of those cases where it is not.  In fact, even
merge-ort is slower.  The "apply" backend can complete this testcase in
    6.940 s ± 0.485 s
which is about 2x faster than merge-ort and 3x faster than
merge-recursive.  One goal of the merge-ort performance work will be to
make it faster than git-am on this (and similar) testcases.

2) mega-renames

2a) Obviously rename detection is a huge cost; it's where most the time
is spent.  We need to cut that down.  If we could somehow infinitely
parallelize it and drive its time to 0, the merge-recursive time would
drop to about 204s, and the merge-ort time would drop to about 17s.  I
think this particular stat shows I've subtly baked a couple performance
improvements into merge-ort and into fast-rebase already.

3) just-one-mega

3a) not much to say here, it just gives some flavor for how rebasing
only one patch compares to rebasing 35.

=== Goals ===

This patch is obviously just the beginning.  Here are some of my goals
that this measurement will help us achieve:

* Drive the cost of rename detection down considerably for merges
* After the above has been achieved, see if there are other slowness
  factors (which would have previously been overshadowed by rename
  detection costs) which we can then focus on and also optimize.
* Ensure our rebase testcase that requires little rename detection
  is noticeably faster with merge-ort than with apply-based rebase.

Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Taylor Blau <ttaylorr@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-23 23:30:06 -08:00
5ced7c3da0 merge-ort: ignore the directory rename split conflict for now
get_provisional_directory_renames() has code to detect directories being
evenly split between different locations.  However, as noted previously,
if there are no new files added to that directory that was split evenly,
our inability to determine where the directory was renamed to doesn't
matter since there are no new files to try to move into the new
location.  Unfortunately, that code is unaware of whether there are new
files under the directory in question and we just ignore that, causing
us to fail t6423 test 2b but pass test 2a; turn off the error for now,
swapping which tests pass and fail.

The motivating reason for switching this off as a temporary measure is
that as we add optimizations, we'll start looking at only subsets of
renames, and subsets of renames can start switching the result we get
when this error is (wrongly) on.  Once we get enough optimizations,
however, we can prevent that code from even running when there are no
new files added to the relevant directory, at which point we can revert
this commit and then both testcases 2a and 2b will pass simultaneously.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-23 23:30:06 -08:00
cf8937acde merge-ort: fix massive leak
When a series of merges was performed (such as for a rebase or series of
cherry-picks), only the data structures allocated by the final merge
operation were being freed.  The problem was that while picking out
pieces of merge-ort to upstream, I previously misread a certain section
of merge_start() and assumed it was associated with a later
optimization.  Include that section now, which ensures that if there was
a previous merge operation, that we clear out result->priv and then
re-use it for opt->priv, and otherwise we allocate opt->priv.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-23 23:30:06 -08:00
7599730b7e Remove support for v1 of the PCRE library
Remove support for using version 1 of the PCRE library. Its use has
been discouraged by upstream for a long time, and it's in a
bugfix-only state.

Anyone who was relying on v1 in particular got a nudge to move to v2
in e6c531b808 (Makefile: make USE_LIBPCRE=YesPlease mean v2, not v1,
2018-03-11), which was first released as part of v2.18.0.

With this the LIBPCRE2 test prerequisites is redundant to PCRE. But
I'm keeping it for self-documentation purposes, and to avoid conflict
with other in-flight PCRE patches.

I'm also not changing all of our own "pcre2" names to "pcre", i.e. the
inverse of 6d4b5747f0 (grep: change internal *pcre* variable &
function names to be *pcre1*, 2017-05-25). I don't see the point, and
it makes the history/blame harder to read. Maybe if there's ever a
PCRE v3...

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-23 21:15:43 -08:00
0205bb13d0 config.mak.uname: remove redundant NO_LIBPCRE1_JIT flag
Remove a flag added in my fb95e2e38d (grep: un-break building with
PCRE >= 8.32 without --enable-jit, 2017-06-01). It's set just below
USE_LIBPCRE=YesPlease, so it's been redundant since
e6c531b808 (Makefile: make USE_LIBPCRE=YesPlease mean v2, not v1,
2018-03-11).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-23 21:15:12 -08:00
19a0acc83e t1092: test interesting sparse-checkout scenarios
These also document some behaviors that differ from a full checkout, and
possibly in a way that is not intended.

The test is designed to be run with "--run=1,X" where 'X' is an
interesting test case. Each test uses 'init_repos' to reset the full and
sparse copies of the initial-repo that is created by the first test
case. This also makes it possible to have test cases leave the working
directory or index in unusual states without disturbing later cases.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-23 17:14:20 -08:00
3b14436364 test-lib: test_region looks for trace2 regions
From ff15d509b89edd4830d85d53cea3079a6b0c1c08 Mon Sep 17 00:00:00 2001
From: Derrick Stolee <dstolee@microsoft.com>
Date: Mon, 11 Jan 2021 08:53:09 -0500
Subject: [PATCH 8/9] test-lib: test_region looks for trace2 regions

Most test cases can verify Git's behavior using input/output
expectations or changes to the .git directory. However, sometimes we
want to check that Git did or did not run a certain section of code.
This is particularly important for performance-only features that we
want to ensure have been enabled in certain cases.

Add a new 'test_region' function that checks if a trace2 region was
entered and left in a given trace2 event log.

There is one existing test (t0500-progress-display.sh) that performs
this check already, so use the helper function instead. Note that this
changes the expectations slightly. The old test (incorrectly) used two
patterns for the 'grep' invocation, but this performs an OR of the
patterns, not an AND. This means that as long as one region_enter event
was logged, the test would succeed, even if it was not due to the
progress category.

More uses will be added in a later change.

t6423-merge-rename-directories.sh also greps for region_enter lines, but
it verifies the number of such lines, which is not the same as an
existence check.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-23 17:14:18 -08:00
dd23022acb sparse-checkout: load sparse-checkout patterns
A future feature will want to load the sparse-checkout patterns into a
pattern_list, but the current mechanism to do so is a bit complicated.
This is made difficult due to needing to find the sparse-checkout file
in different ways throughout the codebase.

The logic implemented in the new get_sparse_checkout_patterns() was
duplicated in populate_from_existing_patterns() in unpack-trees.c. Use
the new method instead, keeping the logic around handling the struct
unpack_trees_options.

The callers to get_sparse_checkout_filename() in
builtin/sparse-checkout.c manipulate the sparse-checkout file directly,
so it is not appropriate to replace logic in that file with
get_sparse_checkout_patterns().

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-23 17:14:07 -08:00
6a9372f4ef name-hash: use trace2 regions for init
The lazy_init_name_hash() populates a hashset with all filenames and
another with all directories represented in the index. This is run only
if we need to use the hashsets to check for existence or case-folding
renames.

Place trace2 regions where there is already a performance trace.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-23 17:14:07 -08:00
1fd9ae517c repository: add repo reference to index_state
It will be helpful to add behavior to index operations that might
trigger an object lookup. Since each index belongs to a specific
repository, add a 'repo' pointer to struct index_state that allows
access to this repository.

Add a BUG() statement if the repo already has an index, and the index
already has a repo, but somehow the index points to a different repo.

This will prevent future changes from needing to pass an additional
'struct repository *repo' parameter and instead rely only on the 'struct
index_state *istate' parameter.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-23 17:14:07 -08:00
cae70acf24 fsmonitor: de-duplicate BUG()s around dirty bits
The index has an fsmonitor_dirty bitmap that records which index entries
are "dirty" based on the response from the FSMonitor. If this bitmap
ever grows larger than the index, then there was an error in how it was
constructed, and it was probably a developer's bug.

There are several BUG() statements that are very similar, so replace
these uses with a simpler assert_index_minimum(). Since there is one
caller that uses a custom 'pos' value instead of the bit_size member, we
cannot simplify it too much. However, the error string is identical in
each, so this simplifies things.

Be sure to add one when checking if a position if valid, since the
minimum is a bound on the expected size.

The end result is that the code is simpler to read while also preserving
these assertions for developers in the FSMonitor space.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-23 17:14:07 -08:00
c80dd3967f cache-tree: extract subtree_pos()
This method will be helpful to use outside of cache-tree.c in a later
feature. The implementation is subtle due to subtree_name_cmp() sorting
by length and then lexicographically.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-23 17:14:07 -08:00
8d87e338e1 cache-tree: simplify verify_cache() prototype
The verify_cache() method takes an array of cache entries and a count,
but these are always provided directly from a struct index_state. Use
a pointer to the full structure instead.

There is a subtle point when istate->cache_nr is zero that subtracting
one will underflow. This triggers a failure in t0000-basic.sh, among
others. Use "i + 1 < istate->cache_nr" to avoid these strange
comparisons. Convert i to be unsigned as well, which also removes the
potential signed overflow in the unlikely case that cache_nr is over 2.1
billion entries. The 'funny' variable has a maximum value of 11, so
making it unsigned does not change anything of importance.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-23 17:14:07 -08:00
fb0882648e cache-tree: clean up cache_tree_update()
Make the method safer by allocating a cache_tree member for the given
index_state if it is not already present. This is preferrable to a
BUG() statement or returning with an error because future callers will
want to populate an empty cache-tree using this method.

Callers can also remove their conditional allocations of cache_tree.

Also drop local variables that can be found directly from the 'istate'
parameter.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-23 17:14:07 -08:00
db89a82b5b rm tests: actually test for SIGPIPE in SIGPIPE test
Change a test initially added in 50cd31c652 (t3600: comment on
inducing SIGPIPE in `git rm`, 2019-11-27) to explicitly test for
SIGPIPE using a pattern initially established in 7559a1be8a (unblock
and unignore SIGPIPE, 2014-09-18).

The problem with using that pattern is that it requires us to skip the
test on MINGW[1]. If we kept the test with its initial semantics[2]
we'd get coverage there, at the cost of not checking whether we
actually had SIGPIPE outside of MinGW.

Arguably we should just remove this test. Between the test added in
7559a1be8a and the change made in 12e0437f23 (common-main: call
restore_sigpipe_to_default(), 2016-07-01) it's a bit arbitrary to only
check this for "git rm".

But in lieu of having wider test coverage for other "git" subcommands
let's refactor this to explicitly test for SIGPIPE outside of MinGW,
and then just that we remove the ".git/index.lock" (as before) on all
platforms.

1. https://lore.kernel.org/git/xmqq1rec5ckf.fsf@gitster.c.googlers.com/
2. 0693f9ddad (Make sure lockfiles are unlocked when dying on SIGPIPE,
   2008-12-18)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-23 13:25:12 -08:00
60127996b5 archive tests: use a cheaper "zipinfo -h" invocation to get header
Change an invocation of zipinfo added in 19ee29401d (t5004: test ZIP
archives with many entries, 2015-08-22) to simply ask zipinfo for the
header info, rather than spewing out info about the entire archive and
race to kill it with SIGPIPE due to the downstream "head -2".

I ran across this because I'm adding a "set -o pipefail" test
mode. This won't be needed for the version of the mode that I'm
introducing (which currently relies on a patch to GNU bash), but I
think this is a good idea anyway.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-23 13:25:12 -08:00
9aebc4708a upload-pack tests: avoid a non-zero "grep" exit status
Continue changing a test that 763b47bafa (t5703: stop losing return
codes of git commands, 2019-11-27) already refactored.

This was originally added as part of a series to add support for
running under bash's "set -o pipefail", under that mode this test will
fail because sometimes there's no commits in the "objs" output.

It's easier to fix that than exempt these tests under a hypothetical
"set -o pipefail" test mode. It looks like we probably won't have
that, but once we've dug this code up let's refactor it[2] so we don't
hide a potential pipe failure.

1. https://lore.kernel.org/git/xmqqzh18o8o6.fsf@gitster.c.googlers.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-23 13:25:12 -08:00
796c248dc1 git-svn tests: rewrite brittle tests to use "--[no-]merges".
Rewrite a brittle tests which used "rev-list" without "--[no-]merges"
to figure out if a set of commits turned into merge commits or not.

Signed-off-by: Jeff King <peff@peff.net>
[ÆAB: wrote commit message]
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-23 13:25:12 -08:00
f918a89e50 git svn mergeinfo tests: refactor "test -z" to use test_must_be_empty
Refactor some old-style test code to use test_must_be_empty instead of
"test -z". This makes a follow-up commit easier to read.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-23 13:25:12 -08:00
4669917e8f git svn mergeinfo tests: modernize redirection & quoting style
Use "<file" instead of "< file", and don't put the closing quote for
strings on an indented line. This makes a follow-up refactoring commit
easier to read.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-23 13:25:12 -08:00
ef83970059 cache-tree tests: explicitly test HEAD and index differences
The test code added in 9c4d6c0297 (cache-tree: Write updated
cache-tree after commit, 2014-07-13) used "ls-files" in lieu of
"ls-tree" because it wanted to test the data in the index, since this
test is testing the cache-tree extension.

Change the test to instead use "ls-tree" for traversal, and then
explicitly check how HEAD differs from the index. This is more easily
understood, and less fragile as numerous past bug fixes[1][2][3] to
the old code we're replacing demonstrate.

As an aside this would be a bit easier if empty pathspecs hadn't been
made an error in d426430e6e (pathspec: warn on empty strings as
pathspec, 2016-06-22) and 9e4e8a64c2 (pathspec: die on empty strings
as pathspec, 2017-06-06).

If that was still allowed this code could be simplified slightly:

	diff --git a/t/t0090-cache-tree.sh b/t/t0090-cache-tree.sh
	index 9bf66c9e68..0b02881f55 100755
	--- a/t/t0090-cache-tree.sh
	+++ b/t/t0090-cache-tree.sh
	@@ -18,19 +18,18 @@ cmp_cache_tree () {
	 # test-tool dump-cache-tree already verifies that all existing data is
	 # correct.
	 generate_expected_cache_tree () {
	-       pathspec="$1" &&
	-       dir="$2${2:+/}" &&
	+       pathspec="$1${1:+/}" &&
	        git ls-tree --name-only HEAD -- "$pathspec" >files &&
	        git ls-tree --name-only -d HEAD -- "$pathspec" >subtrees &&
	-       printf "SHA %s (%d entries, %d subtrees)\n" "$dir" $(wc -l <files) $(wc -l <subtrees) &&
	+       printf "SHA %s (%d entries, %d subtrees)\n" "$pathspec" $(wc -l <files) $(wc -l <subtrees) &&
	        while read subtree
	        do
	-               generate_expected_cache_tree "$pathspec/$subtree/" "$subtree" || return 1
	+               generate_expected_cache_tree "$subtree" || return 1
	        done <subtrees
	 }

	 test_cache_tree () {
	-       generate_expected_cache_tree "." >expect &&
	+       generate_expected_cache_tree >expect &&
	        cmp_cache_tree expect &&
	        rm expect actual files subtrees &&
	        git status --porcelain -- ':!status' ':!expected.status' >status &&

1. c8db708d5d (t0090: avoid passing empty string to printf %d,
   2014-09-30)
2. d69360c6b1 (t0090: tweak awk statement for Solaris
   /usr/xpg4/bin/awk, 2014-12-22)
3. 9b5a9fa60a (t0090: stop losing return codes of git commands,
   2019-11-27)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-23 13:25:12 -08:00
fa6edee776 cache-tree tests: use a sub-shell with less indirection
Change a "cd xyz && work && cd .." pattern introduced in
9c4d6c0297 (cache-tree: Write updated cache-tree after commit,
2014-07-13) to use a sub-shell instead with less indirection.

We did actually recover correctly if we failed in this function since
we were wrapped in a subshell one function call up. Let's just use the
sub-shell at the point where we want to change the directory
instead.

It's important that the "|| return 1" is outside the
subshell. Normally, we `exit 1` from within subshells[1], but that
wouldn't help us exit this loop early[1][2].

Since we can get rid of the wrapper function let's rename the main
function to drop the "rec" (for "recursion") suffix[3].

1. https://lore.kernel.org/git/CAPig+cToj8nQmyBCqC1k7DXF2vXaonCEA-fCJ4x7JBZG2ixYBw@mail.gmail.com/
2. https://lore.kernel.org/git/20150325052952.GE31924@peff.net/
3. https://lore.kernel.org/git/YARsCsgXuiXr4uFX@coredump.intra.peff.net/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-23 13:25:12 -08:00
3226725507 cache-tree tests: remove unused $2 parameter
Remove the $2 paramater. This appears to have been some
work-in-progress code from an earlier version of
9c4d6c0297 (cache-tree: Write updated cache-tree after commit,
2014-07-13) which was left in the final version.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-23 13:25:12 -08:00
3f96d75ef5 cache-tree tests: refactor for modern test style
Refactor the cache-tree test file to use our current recommended
patterns. This makes a subsequent meaningful change easier to read.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-23 13:25:11 -08:00
93a7d9835f ls-files.c: add --deduplicate option
During a merge conflict, the name of a file may appear multiple
times in "git ls-files" output, once for each stage.  If you use
both `--delete` and `--modify` at the same time, the output may
mention a deleted file twice.

When none of the '-t', '-u', or '-s' options is in use, these
duplicate entries do not add much value to the output.

Introduce a new '--deduplicate' option to suppress them.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
[jc: extended doc and rewritten commit log]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-23 11:48:20 -08:00
ed644d1666 ls_files.c: consolidate two for loops into one
This will make it easier to show only one entry per filename in the
next step.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
[jc: corrected the log message]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-23 11:48:20 -08:00
f1c462ea41 ls_files.c: bugfix for --deleted and --modified
This situation may occur in the original code: lstat() failed
but we use `&st` to feed ie_modified() later.

Therefore, we can directly execute show_ce without the judgment of
ie_modified() when lstat() has failed.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
[jc: fixed misindented code]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-23 11:48:11 -08:00
b3970c702c ls-refs.c: traverse prefixes of disjoint "ref-prefix" sets
ls-refs performs a single revision walk over the whole ref namespace,
and sends ones that match with one of the given ref prefixes down to the
user.

This can be expensive if there are many refs overall, but the portion of
them covered by the given prefixes is small by comparison.

To attempt to reduce the difference between the number of refs
traversed, and the number of refs sent, only traverse references which
are in the longest common prefix of the given prefixes. This is very
reminiscent of the approach taken in b31e2680c4 (ref-filter.c: find
disjoint pattern prefixes, 2019-06-26) which does an analogous thing for
multi-patterned 'git for-each-ref' invocations.

The callback 'send_ref' is resilient to ignore extra patterns by
discarding any arguments which do not begin with at least one of the
specified prefixes.

Similarly, the code introduced in b31e2680c4 is resilient to stop early
at metacharacters, but we only pass strict prefixes here. At worst we
would return too many results, but the double checking done by send_ref
will throw away anything that doesn't start with something in the prefix
list.

Finally, if no prefixes were provided, then implicitly add the empty
string (which will match all references) since this matches the existing
behavior (see the "no restrictions" comment in "ls-refs.c:ref_match()").

Original-patch-by: Jacob Vosmaer <jacob@gitlab.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-22 18:57:27 -08:00
83befd3724 ls-refs.c: initialize 'prefixes' before using it
Correctly initialize the "prefixes" strvec using strvec_init() instead
of simply zeroing it via the earlier memset().

There's no way to trigger a crash, since the first 'ref-prefix' command
will initialize the strvec via the 'ALLOC_GROW' in 'strvec_push_nodup()'
(the alloc and nr variables are already zero'd, so the call to
ALLOC_GROW is valid).

If no "ref-prefix" command was given, then the call to
'ls-refs.c:ref_match()' will abort early after it reads the zero in
'prefixes->nr'. Likewise, strvec_clear() will only call free() on the
array, which is NULL, so we're safe there, too.

But, all of this is dangerous and requires more reasoning than it would
if we simply called 'strvec_init()', so do that.

Signed-off-by: Jacob Vosmaer <jacob@gitlab.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-22 18:57:27 -08:00
16b1985be5 refs: expose 'for_each_fullref_in_prefixes'
This function was used in the ref-filter.c code to find the longest
common prefix of among a set of refspecs, and then to iterate all of the
references that descend from that prefix.

A future patch will want to use that same code from ls-refs.c, so
prepare by exposing and moving it to refs.c. Since there is nothing
specific to the ref-filter code here (other than that it was previously
the only caller of this function), this really belongs in the more
generic refs.h header.

The code moved in this patch is identical before and after, with the one
exception of renaming some arguments to be consistent with other
functions exposed in refs.h.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-22 18:57:27 -08:00
be18153b97 builtin/pack-objects.c: avoid iterating all refs
In git-pack-objects, we iterate over all the tags if the --include-tag
option is passed on the command line. For some reason this uses
for_each_ref which is expensive if the repo has many refs. We should
use for_each_tag_ref instead.

Because the add_ref_tag callback will now only visit tags we
simplified it a bit.

The motivation for this change is that we observed performance issues
with a repository on gitlab.com that has 500,000 refs but only 2,000
tags. The fetch traffic on that repo is dominated by CI, and when we
changed CI to fetch with 'git fetch --no-tags' we saw a dramatic
change in the CPU profile of git-pack-objects. This lead us to this
particular ref walk. More details in:
https://gitlab.com/gitlab-com/gl-infra/scalability/-/issues/746#note_483546598

Signed-off-by: Jacob Vosmaer <jacob@gitlab.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-22 17:27:42 -08:00
ee4e22554f run-command: document use_shell option
It's unclear how run-command's use_shell option should impact the
arguments fed to a command. Plausibly it could mean that we glue all of
the arguments together into a string to pass to the shell, in which case
that opens the question of whether the caller needs to quote them.

But in fact we don't implement it that way (and even if we did, we'd
probably auto-quote the arguments as part of the glue step). And we must
not receive quoted arguments, because we might actually optimize out the
shell entirely (i.e., the caller does not even know if a shell will be
involved in the end or not).

Since this ambiguity may have been the cause of a recent bug, let's
document the option a bit.

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-22 14:21:32 -08:00
822ee894f6 t5411: refactor check of refs using test_cmp_refs
Add new helper 'test_cmp_refs' to check references in a repository.

Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-22 13:09:06 -08:00
8388a64cd1 t5411: use different out file to prevent overwriting
SZEDER reported that t5411 failed in Travis CI's s390x environment a
couple of times, and could be reproduced with '--stress' test on this
specific environment.  The test failure messages might look like this:

    + test_cmp expect actual
    --- expect      2021-01-17 21:55:23.430750004 +0000
    +++ actual      2021-01-17 21:55:23.430750004 +0000
    @@ -1 +1 @@
    -<COMMIT-A> refs/heads/main
    +<COMMIT-A> refs/heads/maifatal: the remote end hung up unexpectedly
    error: last command exited with $?=1
    not ok 86 - proc-receive: not support push options (builtin protocol)

The file 'actual' is filtered from the file 'out' which contains result
of 'git show-ref' command.  Due to the error messages from other process
is written into the file 'out' accidentally, t5411 failed.  SZEDER finds
the root cause of this issue:

 - 'git push' is executed with its standard output and error redirected
   to the file 'out'.

 - 'git push' executes 'git receive-pack' internally, which inherits
   the open file descriptors, so its output and error goes into that
   same 'out' file.

 - 'git push' ends without waiting for the close of 'git-receive-pack'
   for some cases, and the file 'out' is reused for test of
   'git show-ref' afterwards.

 - A mixture of the output of 'git show-ref' abd 'git receive-pack'
   leads to this issue.

The first intuitive reaction to resolve this issue is to remove the
file 'out' after use, so that the newly created file 'out' will have a
different file descriptor and will not be overwritten by the
'git receive-pack' process.  But Johannes pointed out that removing an
open file is not possible on Windows.  So we use different temporary
file names to store the output of 'git push' to solve this issue.

Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Helped-by: Johannes Sixt <j6t@kdbg.org>
Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-22 13:09:04 -08:00
8198907795 use delete_refs when deleting tags or branches
'git tag -d' accepts one or more tag refs to delete, but each deletion
is done by calling `delete_ref` on each argv. This is very slow when
removing from packed refs. Use delete_refs instead so all the removals
can be done inside a single transaction with a single update.

Do the same for 'git branch -d'.

Since delete_refs performs all the packed-refs delete operations
inside a single transaction, if any of the deletes fail then all
them will be skipped. In practice, none of them should fail since
we verify the hash of each one before calling delete_refs, but some
network error or odd permissions problem could have different results
after this change.

Also, since the file-backed deletions are not performed in the same
transaction, those could succeed even when the packed-refs transaction
fails.

After deleting branches, remove the branch config only if the branch
ref was removed and was not subsequently added back in.

A manual test deleting 24,000 tags took about 30 minutes using
delete_ref.  It takes about 5 seconds using delete_refs.

Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Phil Hord <phil.hord@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-21 16:05:05 -08:00
36a317929b refs: switch peel_ref() to peel_iterated_oid()
The peel_ref() interface is confusing and error-prone:

  - it's typically used by ref iteration callbacks that have both a
    refname and oid. But since they pass only the refname, we may load
    the ref value from the filesystem again. This is inefficient, but
    also means we are open to a race if somebody simultaneously updates
    the ref. E.g., this:

      int some_ref_cb(const char *refname, const struct object_id *oid, ...)
      {
              if (!peel_ref(refname, &peeled))
                      printf("%s peels to %s",
                             oid_to_hex(oid), oid_to_hex(&peeled);
      }

    could print nonsense. It is correct to say "refname peels to..."
    (you may see the "before" value or the "after" value, either of
    which is consistent), but mentioning both oids may be mixing
    before/after values.

    Worse, whether this is possible depends on whether the optimization
    to read from the current iterator value kicks in. So it is actually
    not possible with:

      for_each_ref(some_ref_cb);

    but it _is_ possible with:

      head_ref(some_ref_cb);

    which does not use the iterator mechanism (though in practice, HEAD
    should never peel to anything, so this may not be triggerable).

  - it must take a fully-qualified refname for the read_ref_full() code
    path to work. Yet we routinely pass it partial refnames from
    callbacks to for_each_tag_ref(), etc. This happens to work when
    iterating because there we do not call read_ref_full() at all, and
    only use the passed refname to check if it is the same as the
    iterator. But the requirements for the function parameters are quite
    unclear.

Instead of taking a refname, let's instead take an oid. That fixes both
problems. It's a little funny for a "ref" function not to involve refs
at all. The key thing is that it's optimizing under the hood based on
having access to the ref iterator. So let's change the name to make it
clear why you'd want this function versus just peel_object().

There are two other directions I considered but rejected:

  - we could pass the peel information into the each_ref_fn callback.
    However, we don't know if the caller actually wants it or not. For
    packed-refs, providing it is essentially free. But for loose refs,
    we actually have to peel the object, which would be wasteful in most
    cases. We could likewise pass in a flag to the callback indicating
    whether the peeled information is known, but that complicates those
    callbacks, as they then have to decide whether to manually peel
    themselves. Plus it requires changing the interface of every
    callback, whether they care about peeling or not, and there are many
    of them.

  - we could make a function to return the peeled value of the current
    iterated ref (computing it if necessary), and BUG() otherwise. I.e.:

      int peel_current_iterated_ref(struct object_id *out);

    Each of the current callers is an each_ref_fn callback, so they'd
    mostly be happy. But:

      - we use those callbacks with functions like head_ref(), which do
        not use the iteration code. So we'd need to handle the fallback
        case there, anyway.

      - it's possible that a caller would want to call into generic code
        that sometimes is used during iteration and sometimes not. This
        encapsulates the logic to do the fast thing when possible, and
        fallback when necessary.

The implementation is mostly obvious, but I want to call out a few
things in the patch:

  - the test-tool coverage for peel_ref() is now meaningless, as it all
    collapses to a single peel_object() call (arguably they were pretty
    uninteresting before; the tricky part of that function is the
    fast-path we see during iteration, but these calls didn't trigger
    that). I've just dropped it entirely, though note that some other
    tests relied on the tags we created; I've moved that creation to the
    tests where it matters.

  - we no longer need to take a ref_store parameter, since we'd never
    look up a ref now. We do still rely on a global "current iterator"
    variable which _could_ be kept per-ref-store. But in practice this
    is only useful if there are multiple recursive iterations, at which
    point the more appropriate solution is probably a stack of
    iterators. No caller used the actual ref-store parameter anyway
    (they all call the wrapper that passes the_repository).

  - the original only kicked in the optimization when the "refname"
    pointer matched (i.e., not string comparison). We do likewise with
    the "oid" parameter here, but fall back to doing an actual oideq()
    call. This in theory lets us kick in the optimization more often,
    though in practice no current caller cares. It should never be
    wrong, though (peeling is a property of an object, so two refs
    pointing to the same object would peel identically).

  - the original took care not to touch the peeled out-parameter unless
    we found something to put in it. But no caller cares about this, and
    anyway, it is enforced by peel_object() itself (and even in the
    optimized iterator case, that's where we eventually end up). We can
    shorten the code and avoid an extra copy by just passing the
    out-parameter through the stack.

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-21 15:51:31 -08:00
73c01d25fe tests: remove uses of GIT_TEST_GETTEXT_POISON=false
As noted in previous commits we are removing the use of
GIT_TEST_GETTEXT_POISON=false. These tests all relied on the facility
being off, it always is off after an earlier change, but we hadn't
removed the redundant assignments to "false" in the tests.

I'm preserving the deletion of "error" lines in 38b9197a76 (t5411:
add basic test cases for proc-receive hook, 2020-08-27), it turns out
that's useful even without GIT_TEST_GETTEXT_POISON=true in
play. Update a comment added in that commit to note that.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-21 15:50:03 -08:00
d162b25f95 tests: remove support for GIT_TEST_GETTEXT_POISON
This removes the ability to inject "poison" gettext() messages via the
GIT_TEST_GETTEXT_POISON special test setup.

I initially added this as a compile-time option in bb946bba76 (i18n:
add GETTEXT_POISON to simulate unfriendly translator, 2011-02-22), and
most recently modified to be toggleable at runtime in
6cdccfce1e (i18n: make GETTEXT_POISON a runtime option, 2018-11-08)..

The reason for its removal is that the trade-off of maintaining it
v.s. what it's getting us has long since flipped. When gettext was
integrated in 5e9637c629 (i18n: add infrastructure for translating
Git with gettext, 2011-11-18) there was understandable concern on the
Git ML that in marking messages for translation en-masse we'd
inadvertently mark plumbing messages. The GETTEXT_POISON facility was
a way to smoke those out via our test suite.

Nowadays however we're done (or almost entirely done) with any marking
of messages for translation. New messages are usually marked by their
authors, who'll know whether it makes sense to translate them or
not. If not any errors in marking the messages are much more likely to
be spotted in review than in the the initial deluge of i18n patches in
the 2011-2012 era.

So let's just remove this. This leaves the test suite in a state where
we still have a lot of test_i18n, C_LOCALE_OUTPUT
etc. uses. Subsequent commits will remove those too.

The change to t/lib-rebase.sh is a selective revert of the relevant
part of f2d17068fd (i18n: rebase-interactive: mark comments of squash
for translation, 2016-06-17), and the comment in
t/t3406-rebase-message.sh is from c7108bf9ed (i18n: rebase: mark
messages for translation, 2012-07-25).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-21 15:50:01 -08:00
6c280b4142 ci: remove GETTEXT_POISON jobs
A subsequent commit will remove GETTEXT_POISON entirely, let's start
by removing the CI jobs that enable the option.

We cannot just remove the job because the CI is implicitly depending
on the "poison" job being a sort of "default" job in the sense that
it's the job that was otherwise run with the default compiler, no
other GIT_TEST_* options etc. So let's keep it under the name
"linux-gcc-default".

This means we can remove the initial "make test" from the "linux-gcc"
job (it does another one after setting a bunch of GIT_TEST_*
variables).

I'm not doing that because it would conflict with the in-flight
334afbc76f (tests: mark tests relying on the current default for
`init.defaultBranch`, 2020-11-18) (currently on the "seen" branch, so
the SHA-1 will almost definitely change). It's going to use that "make
test" again for different reasons, so let's preserve it for now.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-21 15:50:00 -08:00
fe2f4d0031 Merge branch 'en/ort-directory-rename' into en/merge-ort-perf
* en/ort-directory-rename: (28 commits)
  merge-ort: fix a directory rename detection bug
  merge-ort: process_renames() now needs more defensiveness
  merge-ort: implement apply_directory_rename_modifications()
  merge-ort: add a new toplevel_dir field
  merge-ort: implement handle_path_level_conflicts()
  merge-ort: implement check_for_directory_rename()
  merge-ort: implement apply_dir_rename() and check_dir_renamed()
  merge-ort: implement compute_collisions()
  merge-ort: modify collect_renames() for directory rename handling
  merge-ort: implement handle_directory_level_conflicts()
  merge-ort: implement compute_rename_counts()
  merge-ort: copy get_renamed_dir_portion() from merge-recursive.c
  merge-ort: add outline of get_provisional_directory_renames()
  merge-ort: add outline for computing directory renames
  merge-ort: collect which directories are removed in dirs_removed
  merge-ort: initialize and free new directory rename data structures
  merge-ort: add new data structures for directory rename detection
  merge-ort: add implementation of type-changed rename handling
  merge-ort: add implementation of normal rename handling
  merge-ort: add implementation of rename collisions
  ...
2021-01-20 22:52:50 -08:00
203c872c4f merge-ort: fix a directory rename detection bug
As noted in commit 902c521a35 ("t6423: more involved directory rename
test", 2020-10-15), when we have a case where

  * dir/subdir/ has several files
  * almost all files in dir/subdir/ are renamed to folder/subdir/
  * one of the files in dir/subdir/ is renamed to folder/subdir/newsubdir/
  * the other side of history (that doesn't do the renames) adds a
    new file to dir/subdir/

Then for the majority of the file renames, the directory rename of
   dir/subdir/ -> folder/subdir/
is actually not represented that way but as
   dir/ -> folder/
We also had one rename that was represented as
   dir/subdir/ -> folder/subdir/newsubdir/

Now, since there's a new file in dir/subdir/, where does it go?  Well,
there's only one rule for dir/subdir/, so the code previously noted that
this rule had the "majority" of the one "relevant" rename and thus
erroneously used it to place the file in folder/subdir/newsubdir/.  We
really want the heavy weight associated with dir/ -> folder/ to also be
treated as dir/subdir/ -> folder/subdir/, so that we correctly place the
file in folder/subdir/.

Add a bunch of logic to make sure that we use all relevant renamings in
directory rename detection.

Note that testcase 12f of t6423 still fails after this, but it gets
further than merge-recursive does.  There are some performance related
bits in that testcase (the region_enter messages) that do not yet
succeed, but the rest of the testcase works after this patch.
Subsequent patch series will fix up the performance side.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-20 22:18:55 -08:00
1b6b902d95 merge-ort: process_renames() now needs more defensiveness
Since directory rename detection adds new paths to opt->priv->paths and
removes old ones, process_renames() needs to now check whether
pair->one->path actually exists in opt->priv->paths instead of just
assuming it does.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-20 22:18:55 -08:00
089d82bc18 merge-ort: implement apply_directory_rename_modifications()
This function roughly follows the same outline as the function of the
same name from merge-recursive.c, but the code diverges in multiple
ways due to some special considerations:
  * merge-ort's version needs to update opt->priv->paths with any new
    paths (and opt->priv->paths points to struct conflict_infos which
    track quite a bit of metadata for each path); merge-recursive's
    version would directly update the index
  * merge-ort requires that opt->priv->paths has any leading directories
    of any relevant files also be included in the set of paths.  And
    due to pointer equality requirements on merged_info.directory_name,
    we have to be careful how we compute and insert these.
  * due to the above requirements on opt->priv->paths, merge-ort's
    version starts with a long comment to explain all the special
    considerations that need to be handled
  * merge-ort can use the full data stored in opt->priv->paths to avoid
    making expensive get_tree_entry() calls to regather the necessary
    data.
  * due to messages being deferred automatically in merge-ort, this is
    the best place to handle conflict messages whereas in
    merge-recursive.c they are deferred manually so that processing of
    entries does all the printing

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-20 22:18:55 -08:00
05b85c6eeb merge-ort: add a new toplevel_dir field
Due to the string-equality-iff-pointer-equality requirements placed on
merged_info.directory_name, apply_directory_rename_modifications() will
need to have access to the exact toplevel directory name string pointer
and can't just use a new empty string.  Store it in a field that we can
use.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-20 22:18:55 -08:00
bea433655a merge-ort: implement handle_path_level_conflicts()
This is copied from merge-recursive.c, with minor tweaks due to:
  * using strmap API
  * merge-ort not using the non_unique_new_dir field, since it'll
    obviate its need entirely later with performance improvements
  * adding a new path_in_way() function that uses opt->priv->paths
    instead of doing an expensive tree_has_path() lookup to see if
    a tree has a given path.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-20 22:18:55 -08:00
47325e8533 merge-ort: implement check_for_directory_rename()
This is copied from merge-recursive.c, with minor tweaks due to using strmap
API and the fact that it can use opt->priv->paths to get all pathnames that
exist instead of taking a tree object.

This depends on a new function, handle_path_level_conflicts(), which
just has a placeholder die-not-yet-implemented implementation for now; a
subsequent patch will implement it.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-20 22:18:55 -08:00
fbcfc0cc17 merge-ort: implement apply_dir_rename() and check_dir_renamed()
Both of these are copied from merge-recursive.c, with just minor tweaks
due to using strmap API and not having a non_unique_new_dir field.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-20 22:18:55 -08:00
d9d015df4a merge-ort: implement compute_collisions()
This is nearly a wholesale copy of compute_collisions() from
merge-recursive.c, and the logic remains the same, but it has been
tweaked slightly due to:

  * using strmap.h API (instead of direct hashmaps)
  * allocation/freeing of data structures were done separately in
    merge_start() and clear_or_reinit_internal_opts() in an earlier
    patch in this series
  * there is no non_unique_new_dir data field in merge-ort; that will
    be handled a different way

It does depend on two new functions, apply_dir_rename() and
check_dir_renamed() which were introduced with simple
die-not-yet-implemented shells and will be implemented in subsequent
patches.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-20 22:18:55 -08:00
fa5e06d690 merge-ort: modify collect_renames() for directory rename handling
collect_renames() is similar to merge-recursive.c's get_renames(), but
lacks the directory rename handling found in the latter.  Port that code
structure over to merge-ort.  This introduces three new
die-not-yet-implemented functions that will be defined in future
commits.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-20 22:18:55 -08:00
98d0d08128 merge-ort: implement handle_directory_level_conflicts()
This is modelled on the version of handle_directory_level_conflicts()
from merge-recursive.c, but is massively simplified due to the following
factors:
  * strmap API provides simplifications over using direct hashmap
  * we have a dirs_removed field in struct rename_info that we have an
    easy way to populate from collect_merge_info(); this was already
    used in compute_rename_counts() and thus we do not need to check
    for condition #2.
  * The removal of condition #2 by handling it earlier in the code also
    obviates the need to check for condition #3 -- if both sides renamed
    a directory, meaning that the directory no longer exists on either
    side, then neither side could have added any new files to that
    directory, and thus there are no files whose locations we need to
    move due to such a directory rename.

In fact, the same logic that makes condition #3 irrelevant means
condition #1 is also irrelevant so we could drop this function.
However, it is cheap to check if both sides rename the same directory,
and doing so can save future computation.  So, simply remove any
directories that both sides renamed from the list of directory renames.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-20 22:18:55 -08:00
2f620a4f19 merge-ort: implement compute_rename_counts()
This function is based on the first half of get_directory_renames() from
merge-recursive.c; as part of the implementation, factor out a routine,
increment_count(), to update the bookkeeping to track the number of
items renamed into new directories.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-20 22:18:55 -08:00
9fe37e7bb9 merge-ort: copy get_renamed_dir_portion() from merge-recursive.c
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-20 22:18:55 -08:00
04264d4079 merge-ort: add outline of get_provisional_directory_renames()
This function is based on merge-recursive.c's get_directory_renames(),
except that the first half has been split out into a not-yet-implemented
compute_rename_counts().  The primary difference here is our lack of the
non_unique_new_dir boolean in our strmap.  The lack of that field will
at first cause us to fail testcase 2b of t6423; however, future
optimizations will obviate the need for that ugly field so we have just
left it out.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-20 22:18:55 -08:00
112e11126b merge-ort: add outline for computing directory renames
Port some directory rename handling changes from merge-recursive.c's
detect_and_process_renames() to the same-named function of merge-ort.c.
This does not yet add any use or handling of directory renames, just the
outline for where we start to compute them.  Thus, a future patch will
add port additional changes to merge-ort's detect_and_process_renames().

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-20 22:18:55 -08:00
3cf5f221be t7900: clean up some broken refs
The tests for the 'prefetch' task create remotes and fetch refs into
'refs/prefetch/<remote>/' and tags into 'refs/tags/'. These tests use
the remotes to create objects not intended to be seen by the "local"
repository.

In that sense, the incrmental-repack tasks did not have these objects
and refs in mind. That test replaces the object directory with a
specific pack-file layout for testing the batch-size logic. However,
this causes some operations to start showing warnings such as:

 error: refs/prefetch/remote1/one does not point to a valid object!
 error: refs/tags/one does not point to a valid object!

This only shows up if you run the tests verbosely and watch the output.
It caught my eye and I _thought_ that there was a bug where 'git gc' or
'git repack' wouldn't check 'refs/prefetch/' before pruning objects.
That is incorrect. Those commands do handle 'refs/prefetch/' correctly.

All that is left is to clean up the tests in t7900-maintenance.sh to
remove these tags and refs that are not being repacked for the
incremental-repack tests. Use update-ref to ensure this works with all
ref backends.

Helped-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-20 18:46:22 -08:00
96eaffebbf maintenance: set log.excludeDecoration durin prefetch
The 'prefetch' task fetches refs from all remotes and places them in the
refs/prefetch/<remote>/ refspace. As this task is intended to run in the
background, this allows users to keep their local data very close to the
remote servers' data while not updating the users' understanding of the
remote refs in refs/remotes/<remote>/.

However, this can clutter 'git log' decorations with copies of the refs
with the full name 'refs/prefetch/<remote>/<branch>'.

The log.excludeDecoration config option was added in a6be5e67 (log: add
log.excludeDecoration config option, 2020-05-16) for exactly this
purpose.

Ensure we set this only for users that would benefit from it by
assigning it at the beginning of the prefetch task. Other alternatives
would be during 'git maintenance register' or 'git maintenance start',
but those might assign the config even when the prefetch task is
disabled by existing config. Further, users could run 'git maintenance
run --task=prefetch' using their own scripting or scheduling. This
provides the best coverage to automatically update the config when
valuable.

It is improbable, but possible, that users might want to run the
prefetch task _and_ see these refs in their log decorations. This seems
incredibly unlikely to me, but users can always opt-in on a
command-by-command basis using --decorate-refs=refs/prefetch/.

Test that this works in a few cases. In particular, ensure that our
assignment of log.excludeDecoration=refs/prefetch/ is additive to other
existing exclusions. Further, ensure we do not add multiple copies in
multiple runs.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-20 18:46:22 -08:00
498bb5b82e sequencer: factor out code to append squash message
This code is going to grow over the next two commits so move it to
its own function.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-20 17:50:11 -08:00
eab0df0e5b rebase -i: only write fixup-message when it's needed
The file "$GIT_DIR/rebase-merge/fixup-message" is only used for fixup
commands, there's no point in writing it for squash commands as it is
immediately deleted.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-20 17:50:11 -08:00
1fb5cf0da6 commit: ignore additional signatures when parsing signed commits
When we create a commit with multiple signatures, neither of these
signatures includes the other.  Consequently, when we produce the
payload which has been signed so we can verify the commit, we must strip
off any other signatures, or the payload will differ from what was
signed.  Do so, and in preparation for verifying with multiple
algorithms, pass the algorithm we want to verify into
parse_signed_commit.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-18 17:38:20 -08:00
83dff3eb2e ref-filter: switch some uses of unsigned long to size_t
In the future, we'll want to pass some of the arguments of find_subpos
to strbuf_detach, which takes a size_t.  This is fine on systems where
that's the same size as unsigned long, but that isn't the case on all
systems.  Moreover, size_t makes sense since it's not possible to use a
buffer here that's larger than memory anyway.

Let's switch each use to size_t for these lengths in
grab_sub_body_contents and find_subpos.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-18 17:38:19 -08:00
5a3b130cad doc: add corrected commit date info
With generation data chunk and corrected commit dates implemented, let's
update the technical documentation for commit-graph.

Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-18 16:21:18 -08:00
8d00d7c3df commit-reach: use corrected commit dates in paint_down_to_common()
091f4cf (commit: don't use generation numbers if not needed,
2018-08-30) changed paint_down_to_common() to use commit dates instead
of generation numbers v1 (topological levels) as the performance
regressed on certain topologies. With generation number v2 (corrected
commit dates) implemented, we no longer have to rely on commit dates and
can use generation numbers.

For example, the command `git merge-base v4.8 v4.9` on the Linux
repository walks 167468 commits, taking 0.135s for committer date and
167496 commits, taking 0.157s for corrected committer date respectively.

While using corrected commit dates, Git walks nearly the same number of
commits as commit date, the process is slower as for each comparision we
have to access a commit-slab (for corrected committer date) instead of
accessing struct member (for committer date).

This change incidentally broke the fragile t6404-recursive-merge test.
t6404-recursive-merge sets up a unique repository where all commits have
the same committer date without a well-defined merge-base.

While running tests with GIT_TEST_COMMIT_GRAPH unset, we use committer
date as a heuristic in paint_down_to_common(). 6404.1 'combined merge
conflicts' merges commits in the order:
- Merge C with B to form an intermediate commit.
- Merge the intermediate commit with A.

With GIT_TEST_COMMIT_GRAPH=1, we write a commit-graph and subsequently
use the corrected committer date, which changes the order in which
commits are merged:
- Merge A with B to form an intermediate commit.
- Merge the intermediate commit with C.

While resulting repositories are equivalent, 6404.4 'virtual trees were
processed' fails with GIT_TEST_COMMIT_GRAPH=1 as we are selecting
different merge-bases and thus have different object ids for the
intermediate commits.

As this has already causes problems (as noted in 859fdc0 (commit-graph:
define GIT_TEST_COMMIT_GRAPH, 2018-08-29)), we disable commit graph
within t6404-recursive-merge.

Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-18 16:21:18 -08:00
1fdc383c5c commit-graph: use generation v2 only if entire chain does
Since there are released versions of Git that understand generation
numbers in the commit-graph's CDAT chunk but do not understand the GDAT
chunk, the following scenario is possible:

1. "New" Git writes a commit-graph with the GDAT chunk.
2. "Old" Git writes a split commit-graph on top without a GDAT chunk.

If each layer of split commit-graph is treated independently, as it was
the case before this commit, with Git inspecting only the current layer
for chunk_generation_data pointer, commits in the lower layer (one with
GDAT) whould have corrected commit date as their generation number,
while commits in the upper layer would have topological levels as their
generation. Corrected commit dates usually have much larger values than
topological levels. This means that if we take two commits, one from the
upper layer, and one reachable from it in the lower layer, then the
expectation that the generation of a parent is smaller than the
generation of a child would be violated.

It is difficult to expose this issue in a test. Since we _start_ with
artificially low generation numbers, any commit walk that prioritizes
generation numbers will walk all of the commits with high generation
number before walking the commits with low generation number. In all the
cases I tried, the commit-graph layers themselves "protect" any
incorrect behavior since none of the commits in the lower layer can
reach the commits in the upper layer.

This issue would manifest itself as a performance problem in this case,
especially with something like "git log --graph" since the low
generation numbers would cause the in-degree queue to walk all of the
commits in the lower layer before allowing the topo-order queue to write
anything to output (depending on the size of the upper layer).

Therefore, When writing the new layer in split commit-graph, we write a
GDAT chunk only if the topmost layer has a GDAT chunk. This guarantees
that if a layer has GDAT chunk, all lower layers must have a GDAT chunk
as well.

Rewriting layers follows similar approach: if the topmost layer below
the set of layers being rewritten (in the split commit-graph chain)
exists, and it does not contain GDAT chunk, then the result of rewrite
does not have GDAT chunks either.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-18 16:21:18 -08:00
e8b63005c4 commit-graph: implement generation data chunk
As discovered by Ævar, we cannot increment graph version to
distinguish between generation numbers v1 and v2 [1]. Thus, one of
pre-requistes before implementing generation number v2 was to
distinguish between graph versions in a backwards compatible manner.

We are going to introduce a new chunk called Generation DATa chunk (or
GDAT). GDAT will store corrected committer date offsets whereas CDAT
will still store topological level.

Old Git does not understand GDAT chunk and would ignore it, reading
topological levels from CDAT. New Git can parse GDAT and take advantage
of newer generation numbers, falling back to topological levels when
GDAT chunk is missing (as it would happen with a commit-graph written
by old Git).

We introduce a test environment variable 'GIT_TEST_COMMIT_GRAPH_NO_GDAT'
which forces commit-graph file to be written without generation data
chunk to emulate a commit-graph file written by old Git.

To minimize the space required to store corrrected commit date, Git
stores corrected commit date offsets into the commit-graph file, instea
of corrected commit dates. This saves us 4 bytes per commit, decreasing
the GDAT chunk size by half, but it's possible for the offset to
overflow the 4-bytes allocated for storage. As such overflows are and
should be exceedingly rare, we use the following overflow management
scheme:

We introduce a new commit-graph chunk, Generation Data OVerflow ('GDOV')
to store corrected commit dates for commits with offsets greater than
GENERATION_NUMBER_V2_OFFSET_MAX.

If the offset is greater than GENERATION_NUMBER_V2_OFFSET_MAX, we set
the MSB of the offset and the other bits store the position of corrected
commit date in GDOV chunk, similar to how Extra Edge List is maintained.

We test the overflow-related code with the following repo history:

           F - N - U
          /         \
U - N - U            N
         \          /
	  N - F - N

Where the commits denoted by U have committer date of zero seconds
since Unix epoch, the commits denoted by N have committer date of
1112354055 (default committer date for the test suite) seconds since
Unix epoch and the commits denoted by F have committer date of
(2 ^ 31 - 2) seconds since Unix epoch.

The largest offset observed is 2 ^ 31, just large enough to overflow.

[1]: https://lore.kernel.org/git/87a7gdspo4.fsf@evledraar.gmail.com/

Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-18 16:21:18 -08:00
c1a09119f6 commit-graph: implement corrected commit date
With most of preparations done, let's implement corrected commit date.

The corrected commit date for a commit is defined as:

* A commit with no parents (a root commit) has corrected commit date
  equal to its committer date.
* A commit with at least one parent has corrected commit date equal to
  the maximum of its commit date and one more than the largest corrected
  commit date among its parents.

As a special case, a root commit with timestamp of zero (01.01.1970
00:00:00Z) has corrected commit date of one, to be able to distinguish
from GENERATION_NUMBER_ZERO (that is, an uncomputed corrected commit
date).

To minimize the space required to store corrected commit date, Git
stores corrected commit date offsets into the commit-graph file. The
corrected commit date offset for a commit is defined as the difference
between its corrected commit date and actual commit date.

Storing corrected commit date requires sizeof(timestamp_t) bytes, which
in most cases is 64 bits (uintmax_t). However, corrected commit date
offsets can be safely stored using only 32-bits. This halves the size
of GDAT chunk, which is a reduction of around 6% in the size of
commit-graph file.

However, using offsets be problematic if a commit is malformed but valid
and has committer date of 0 Unix time, as the offset would be the same
as corrected commit date and thus require 64-bits to be stored properly.

While Git does not write out offsets at this stage, Git stores the
corrected commit dates in member generation of struct commit_graph_data.
It will begin writing commit date offsets with the introduction of
generation data chunk.

Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-18 16:21:18 -08:00
d7f92784c6 commit-graph: return 64-bit generation number
In a preparatory step for introducing corrected commit dates, let's
return timestamp_t values from commit_graph_generation(), use
timestamp_t for local variables and define GENERATION_NUMBER_INFINITY
as (2 ^ 63 - 1) instead.

We rename GENERATION_NUMBER_MAX to GENERATION_NUMBER_V1_MAX to
represent the largest topological level we can store in the commit data
chunk.

With corrected commit dates implemented, we will have two such *_MAX
variables to denote the largest offset and largest topological level
that can be stored.

Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-18 16:21:18 -08:00
72a2bfcaf0 commit-graph: add a slab to store topological levels
In a later commit we will introduce corrected commit date as the
generation number v2. Corrected commit dates will be stored in the new
seperate Generation Data chunk. However, to ensure backwards
compatibility with "Old" Git we need to continue to write generation
number v1 (topological levels) to the commit data chunk. Thus, we need
to compute and store both versions of generation numbers to write the
commit-graph file.

Therefore, let's introduce a commit-slab `topo_level_slab` to store
topological levels; corrected commit date will be stored in the member
`generation` of struct commit_graph_data.

The macros `GENERATION_NUMBER_INFINITY` and `GENERATION_NUMBER_ZERO`
mark commits not in the commit-graph file and commits written by a
version of Git that did not compute generation numbers respectively.
Generation numbers are computed identically for both kinds of commits.

A "slab-miss" should return `GENERATION_NUMBER_INFINITY` as the commit
is not in the commit-graph file. However, since the slab is
zero-initialized, it returns 0 (or rather `GENERATION_NUMBER_ZERO`).
Thus, we no longer need to check if the topological level of a commit is
`GENERATION_NUMBER_INFINITY`.

We will add a pointer to the slab in `struct write_commit_graph_context`
and `struct commit_graph` to populate the slab in
`fill_commit_graph_info` if the commit has a pre-computed topological
level as in case of split commit-graphs.

Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-18 16:21:18 -08:00
c0ef139843 t6600-test-reach: generalize *_three_modes
In a preparatory step to implement generation number v2, we add tests to
ensure Git can read and parse commit-graph files without Generation Data
chunk. These files represent commit-graph files written by Old Git and
are neccesary for backward compatability.

We extend run_three_modes() and test_three_modes() to *_all_modes() with
the fourth mode being "commit-graph without generation data chunk".

Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-18 16:21:18 -08:00
f90fca638e commit-graph: consolidate fill_commit_graph_info
Both fill_commit_graph_info() and fill_commit_in_graph() parse
information present in commit data chunk. Let's simplify the
implementation by calling fill_commit_graph_info() within
fill_commit_in_graph().

fill_commit_graph_info() used to not load committer data from commit data
chunk. However, with the upcoming switch to using corrected committer
date as generation number v2, we will have to load committer date to
compute generation number value anyway.

e51217e15 (t5000: test tar files that overflow ustar headers,
30-06-2016) introduced a test 'generate tar with future mtime' that
creates a commit with committer date of (2^36 + 1) seconds since
EPOCH. The CDAT chunk provides 34-bits for storing committer date, thus
committer time overflows into generation number (within CDAT chunk) and
has undefined behavior.

The test used to pass as fill_commit_graph_info() would not set struct
member `date` of struct commit and load committer date from the object
database, generating a tar file with the expected mtime.

However, with corrected commit date, we will load the committer date
from CDAT chunk (truncated to lower 34-bits to populate the generation
number. Thus, Git sets date and generates tar file with the truncated
mtime.

The ustar format (the header format used by most modern tar programs)
only has room for 11 (or 12, depending on some implementations) octal
digits for the size and mtime of each file.

As the CDAT chunk is overflow by 12-octal digits but not 11-octal
digits, we split the existing tests to test both implementations
separately and add a new explicit test for 11-digit implementation.

To test the 11-octal digit implementation, we create a future commit
with committer date of 2^34 - 1, which overflows 11-octal digits without
overflowing 34-bits of the Commit Date chunks.

To test the 12-octal digit implementation, the smallest committer date
possible is 2^36 + 1, which overflows the CDAT chunk and thus
commit-graph must be disabled for the test.

Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-18 16:21:18 -08:00
2f9bbb6d91 revision: parse parent in indegree_walk_step()
In indegree_walk_step(), we add unvisited parents to the indegree queue.
However, parents are not guaranteed to be parsed. As the indegree queue
sorts by generation number, let's parse parents before inserting them to
ensure the correct priority order.

Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-18 16:21:18 -08:00
e30c5ee76c commit-graph: fix regression when computing Bloom filters
Before computing Bloom filters, the commit-graph machinery uses
commit_gen_cmp to sort commits by generation order for improved diff
performance. 3d11275505 (commit-graph: examine commits by generation
number, 2020-03-30) claims that this sort can reduce the time spent to
compute Bloom filters by nearly half.

But since c49c82aa4c (commit: move members graph_pos, generation to a
slab, 2020-06-17), this optimization is broken, since asking for a
'commit_graph_generation()' directly returns GENERATION_NUMBER_INFINITY
while writing.

Not all hope is lost, though: 'commit_gen_cmp()' falls back to
comparing commits by their date when they have equal generation number,
and so since c49c82aa4c is purely a date comparison function. This
heuristic is good enough that we don't seem to loose appreciable
performance while computing Bloom filters.

Applying this patch (compared with v2.30.0) speeds up computing Bloom
filters by factors ranging from 0.40% to 5.19% on various repositories [1].

So, avoid the useless 'commit_graph_generation()' while writing by
instead accessing the slab directly. This returns the newly-computed
generation numbers, and allows us to avoid the heuristic by directly
comparing generation numbers.

[1]: https://lore.kernel.org/git/20210105094535.GN8396@szeder.dev/

Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-18 16:21:17 -08:00
a4b6d202ca cache-tree: speed up consecutive path comparisons
The previous change reduced time spent in strlen() while comparing
consecutive paths in verify_cache(), but we can do better. The
conditional checks the existence of a directory separator at the correct
location, but only after doing a string comparison. Swap the order to be
logically equivalent but perform fewer string comparisons.

To test the effect on performance, I used a repository with over three
million paths in the index. I then ran the following command on repeat:

  git -c index.threads=1 commit --amend --allow-empty --no-edit

Here are the measurements over 10 runs after a 5-run warmup:

  Benchmark #1: v2.30.0
    Time (mean ± σ):     854.5 ms ±  18.2 ms
    Range (min … max):   825.0 ms … 892.8 ms

  Benchmark #2: Previous change
    Time (mean ± σ):     833.2 ms ±  10.3 ms
    Range (min … max):   815.8 ms … 849.7 ms

  Benchmark #3: This change
    Time (mean ± σ):     815.5 ms ±  18.1 ms
    Range (min … max):   795.4 ms … 849.5 ms

This change is 2% faster than the previous change and 5% faster than
v2.30.0.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-15 23:05:13 -08:00
0b72536a0b cache-tree: use ce_namelen() instead of strlen()
Use the name length field of cache entries instead of calculating its
value anew.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-15 23:05:13 -08:00
4bdde337f4 index-format: discuss recursion of cache-tree better
The end of the cache tree index extension format trails off with
ellipses ever since 23fcc98 (doc: technical details about the index
file format, 2011-03-01). While an intuitive reader could gather what
this means, it could be better to use "and so on" instead.

Really, this is only justified because I also wanted to point out that
the number of subtrees in the index format is used to determine when the
recursive depth-first-search stack should be "popped." This should help
to add clarity to the format.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-15 23:04:59 -08:00
22ad8600c1 index-format: update preamble to cache tree extension
I had difficulty in my efforts to learn about the cache tree extension
based on the documentation and code because I had an incorrect
assumption about how it behaved. This might be due to some ambiguity in
the documentation, so this change modifies the beginning of the cache
tree format by expanding the description of the feature.

My hope is that this documentation clarifies a few things:

1. There is an in-memory recursive tree structure that is constructed
   from the extension data. This structure has a few differences, such
   as where the name is stored.

2. What does it mean for an entry to be invalid?

3. When exactly are "new" trees created?

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-15 23:04:46 -08:00
845d15d4d0 index-format: use 'cache tree' over 'cached tree'
The index has a "cache tree" extension. This corresponds to a
significant API implemented in cache-tree.[ch]. However, there are a few
places that refer to this erroneously as "cached tree". These are rare,
but notably the index-format.txt file itself makes this error.

The only other reference is in t7104-reset-hard.sh.

Reported-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-15 23:04:38 -08:00
0e5c950267 cache-tree: trace regions for prime_cache_tree
Commands such as "git reset --hard" rebuild the in-memory representation
of the cache tree index extension by parsing tree objects starting at a
known root tree. The performance of this operation can vary widely
depending on the width and depth of the repository's working directory
structure. Measure the time in this operation using trace2 regions in
prime_cache_tree().

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-15 23:04:32 -08:00
4c3e18723c cache-tree: trace regions for I/O
As we write or read the cache tree index extension, it can be good to
isolate how much of the file I/O time is spent constructing this
in-memory tree from the existing index or writing it out again to the
new index file. Use trace2 regions to indicate that we are spending time
on this operation.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-15 23:04:21 -08:00
66e871b664 The third batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-15 21:48:47 -08:00
49656f9445 Merge branch 'jc/macos-install-dependencies-fix'
Fix for procedure to building CI test environment for mac.

* jc/macos-install-dependencies-fix:
  ci/install-depends: attempt to fix "brew cask" stuff
2021-01-15 21:48:47 -08:00
8782bfbf01 Merge branch 'tb/local-clone-race-doc'
Doc update.

* tb/local-clone-race-doc:
  Documentation/git-clone.txt: document race with --local
2021-01-15 21:48:47 -08:00
644d85e751 Merge branch 'bc/doc-status-short'
Doc update.

* bc/doc-status-short:
  docs: rephrase and clarify the git status --short format
2021-01-15 21:48:47 -08:00
453e149c8a Merge branch 'dl/p4-encode-after-kw-expansion'
Text encoding fix for "git p4".

* dl/p4-encode-after-kw-expansion:
  git-p4: fix syncing file types with pattern
2021-01-15 21:48:47 -08:00
cf2870adda Merge branch 'ab/gettext-charset-comment-fix'
Comments update.

* ab/gettext-charset-comment-fix:
  gettext.c: remove/reword a mostly-useless comment
  Makefile: remove a warning about old GETTEXT_POISON flag
2021-01-15 21:48:46 -08:00
eecc5f0775 Merge branch 'ug/doc-lose-dircache'
Doc update.

* ug/doc-lose-dircache:
  doc: remove "directory cache" from man pages
2021-01-15 21:48:46 -08:00
d9e1cd555d Merge branch 'ad/t4129-setfacl-target-fix'
Test fix.

* ad/t4129-setfacl-target-fix:
  t4129: fix setfacl-related permissions failure
2021-01-15 21:48:46 -08:00
2b8cef2307 Merge branch 'jk/t5516-deflake'
Test fix.

* jk/t5516-deflake:
  t5516: loosen "not our ref" error check
2021-01-15 21:48:46 -08:00
788f488b33 Merge branch 'vv/send-email-with-less-secure-apps-access'
Doc update.

* vv/send-email-with-less-secure-apps-access:
  git-send-email.txt: mention less secure app access with Gmail
2021-01-15 21:48:46 -08:00
073552d7ae Merge branch 'pb/mergetool-tool-help-fix'
Fix 2.29 regression where "git mergetool --tool-help" fails to list
all the available tools.

* pb/mergetool-tool-help-fix:
  mergetool--lib: fix '--tool-help' to correctly show available tools
2021-01-15 21:48:46 -08:00
aa08688362 Merge branch 'ds/for-each-repo-noopfix'
"git for-each-repo --config=<var> <cmd>" should not run <cmd> for
any repository when the configuration variable <var> is not defined
even once.

* ds/for-each-repo-noopfix:
  for-each-repo: do nothing on empty config
2021-01-15 21:48:46 -08:00
6a393f36d9 Merge branch 'jc/sign-off'
Doc update.

* jc/sign-off:
  SubmittingPatches: tighten wording on "sign-off" procedure
2021-01-15 21:48:45 -08:00
8dbabb31df Merge branch 'mt/t4129-with-setgid-dir'
Some tests expect that "ls -l" output has either '-' or 'x' for
group executable bit, but setgid bit can be inherited from parent
directory and make these fields 'S' or 's' instead, causing test
failures.

* mt/t4129-with-setgid-dir:
  t4129: don't fail if setgid is set in the test directory
2021-01-15 21:48:45 -08:00
b2ace18759 Merge branch 'ds/maintenance-part-4'
Follow-up on the "maintenance part-3" which introduced scheduled
maintenance tasks to support platforms whose native scheduling
methods are not 'cron'.

* ds/maintenance-part-4:
  maintenance: use Windows scheduled tasks
  maintenance: use launchctl on macOS
  maintenance: include 'cron' details in docs
  maintenance: extract platform-specific scheduling
2021-01-15 21:48:45 -08:00
4151fdb1c7 The second batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-15 15:20:30 -08:00
f9fb9063fd Merge branch 'fc/completion-aliases-support'
Bash completion (in contrib/) update to make it easier for
end-users to add completion for their custom "git" subcommands.

* fc/completion-aliases-support:
  completion: add proper public __git_complete
  test: completion: add tests for __git_complete
  completion: bash: improve function detection
  completion: bash: add __git_have_func helper
2021-01-15 15:20:30 -08:00
62fb47a4d3 Merge branch 'en/stash-apply-sparse-checkout'
"git stash" did not work well in a sparsely checked out working
tree.

* en/stash-apply-sparse-checkout:
  stash: fix stash application in sparse-checkouts
  stash: remove unnecessary process forking
  t7012: add a testcase demonstrating stash apply bugs in sparse checkouts
2021-01-15 15:20:29 -08:00
1ee70a916d Merge branch 'ar/t6016-modernise'
Test update.

* ar/t6016-modernise:
  t6016: move to lib-log-graph.sh framework
2021-01-15 15:20:29 -08:00
2ce8de6bf9 Merge branch 'zh/arg-help-format'
Clean up option descriptions in "git cmd --help".

* zh/arg-help-format:
  builtin/*: update usage format
  parse-options: format argh like error messages
2021-01-15 15:20:29 -08:00
7bfa022993 Merge branch 'nk/perf-fsmonitor-cleanup'
Test fix.

* nk/perf-fsmonitor-cleanup:
  p7519: allow running without watchman prereq
2021-01-15 15:20:29 -08:00
02feca721e Merge branch 'ds/trace2-topo-walk'
The topological walk codepath is covered by new trace2 stats.

* ds/trace2-topo-walk:
  revision: trace topo-walk statistics
2021-01-15 15:20:29 -08:00
df26861c56 Merge branch 'rs/rebase-commit-validation'
Diagnose command line error of "git rebase" early.

* rs/rebase-commit-validation:
  rebase: verify commit parameter
2021-01-15 15:20:29 -08:00
8b327f1784 Merge branch 'ma/sha1-is-a-hash'
Retire more names with "sha1" in it.

* ma/sha1-is-a-hash:
  hash-lookup: rename from sha1-lookup
  sha1-lookup: rename `sha1_pos()` as `hash_pos()`
  object-file.c: rename from sha1-file.c
  object-name.c: rename from sha1-name.c
2021-01-15 15:20:29 -08:00
16a8055dae Merge branch 'ma/doc-pack-format-varint-for-sizes'
Doc update.

* ma/doc-pack-format-varint-for-sizes:
  pack-format.txt: document sizes at start of delta data
2021-01-15 15:20:29 -08:00
a11571bb7f Merge branch 'ma/t1300-cleanup'
Code clean-up.

* ma/t1300-cleanup:
  t1300: don't needlessly work with `core.foo` configs
  t1300: remove duplicate test for `--file no-such-file`
  t1300: remove duplicate test for `--file ../foo`
2021-01-15 15:20:28 -08:00
40876260ef Merge branch 'pb/doc-modules-git-work-tree-typofix'
Doc fix.

* pb/doc-modules-git-work-tree-typofix:
  gitmodules.txt: fix 'GIT_WORK_TREE' variable name
2021-01-15 15:20:28 -08:00
b17eb5b4e4 Merge branch 'ta/doc-typofix'
Doc fix.

* ta/doc-typofix:
  doc: fix some typos
2021-01-15 15:20:28 -08:00
9ba366f12b Merge branch 'bc/rev-parse-path-format'
"git rev-parse" can be explicitly told to give output as absolute
or relative path with the `--path-format=(absolute|relative)` option.

* bc/rev-parse-path-format:
  rev-parse: add option for absolute or relative path formatting
  abspath: add a function to resolve paths with missing components
2021-01-15 15:20:28 -08:00
6dbbae17d9 Merge branch 'ew/decline-core-abbrev'
The configuration variable 'core.abbrev' can be set to 'no' to
force no abbreviation regardless of the hash algorithm.

* ew/decline-core-abbrev:
  core.abbrev=no disables abbreviations
2021-01-15 15:20:28 -08:00
d8d77153ea config: allow specifying config entries via envvar pairs
While we currently have the `GIT_CONFIG_PARAMETERS` environment variable
which can be used to pass runtime configuration data to git processes,
it's an internal implementation detail and not supposed to be used by
end users.

Next to being for internal use only, this way of passing config entries
has a major downside: the config keys need to be parsed as they contain
both key and value in a single variable. As such, it is left to the user
to escape any potentially harmful characters in the value, which is
quite hard to do if values are controlled by a third party.

This commit thus adds a new way of adding config entries via the
environment which gets rid of this shortcoming. If the user passes the
`GIT_CONFIG_COUNT=$n` environment variable, Git will parse environment
variable pairs `GIT_CONFIG_KEY_$i` and `GIT_CONFIG_VALUE_$i` for each
`i` in `[0,n)`.

While the same can be achieved with `git -c <name>=<value>`, one may
wish to not do so for potentially sensitive information. E.g. if one
wants to set `http.extraHeader` to contain an authentication token,
doing so via `-c` would trivially leak those credentials via e.g. ps(1),
which typically also shows command arguments.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-15 13:03:45 -08:00
b9d147fb15 environment: make getenv_safe() a public function
The `getenv_safe()` helper function helps to safely retrieve multiple
environment values without the need to depend on platform-specific
behaviour for the return value's lifetime. We'll make use of this
function in a following patch, so let's make it available by making it
non-static and adding a declaration.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-15 13:03:45 -08:00
1ff21c05ba config: store "git -c" variables using more robust format
The previous commit added a new format for $GIT_CONFIG_PARAMETERS which
is able to robustly handle subsections with "=" in them. Let's start
writing the new format. Unfortunately, this does much less than you'd
hope, because "git -c" itself has the same ambiguity problem! But it's
still worth doing:

  - we've now pushed the problem from the inter-process communication
    into the "-c" command-line parser. This would free us up to later
    add an unambiguous format there (e.g., separate arguments like "git
    --config key value", etc).

  - for --config-env, the parser already disallows "=" in the
    environment variable name. So:

      git --config-env section.with=equals.key=ENVVAR

    will robustly set section.with=equals.key to the contents of
    $ENVVAR.

The new test shows the improvement for --config-env.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-15 13:03:18 -08:00
f9dbb64fad config: parse more robust format in GIT_CONFIG_PARAMETERS
When we stuff config options into GIT_CONFIG_PARAMETERS, we shell-quote
each one as a single unit, like:

  'section.one=value1' 'section.two=value2'

On the reading side, we de-quote to get the individual strings, and then
parse them by splitting on the first "=" we find. This format is
ambiguous, because an "=" may appear in a subsection. So the config
represented in a file by both:

  [section "subsection=with=equals"]
  key = value

and:

  [section]
  subsection = with=equals.key=value

ends up in this flattened format like:

  'section.subsection=with=equals.key=value'

and we can't tell which was desired. We have traditionally resolved this
by taking the first "=" we see starting from the left, meaning that we
allowed arbitrary content in the value, but not in the subsection.

Let's make our environment format a bit more robust by separately
quoting the key and value. That turns those examples into:

  'section.subsection=with=equals.key'='value'

and:

  'section.subsection'='with=equals.key=value'

respectively, and we can tell the difference between them. We can detect
which format is in use for any given element of the list based on the
presence of the unquoted "=". That means we can continue to allow the
old format to work to support any callers which manually used the old
format, and we can even intermingle the two formats. The old format
wasn't documented, and nobody was supposed to be using it. But it's
likely that such callers exist in the wild, so it's nice if we can avoid
breaking them. Likewise, it may be possible to trigger an older version
of "git -c" that runs a script that calls into a newer version of "git
-c"; that new version would see the intermingled format.

This does create one complication, which is that the obvious format in
the new scheme for

  [section]
  some-bool

is:

  'section.some-bool'

with no equals. We'd mistake that for an old-style variable. And it even
has the same meaning in the old style, but:

  [section "with=equals"]
  some-bool

does not. It would be:

  'section.with=equals=some-bool'

which we'd take to mean:

  [section]
  with = equals=some-bool

in the old, ambiguous style. Likewise, we can't use:

  'section.some-bool'=''

because that's ambiguous with an actual empty string. Instead, we'll
again use the shell-quoting to give us a hint, and use:

  'section.some-bool'=

to show that we have no value.

Note that this commit just expands the reading side. We'll start writing
the new format via "git -c" in a future patch. In the meantime, the
existing "git -c" tests will make sure we didn't break reading the old
format. But we'll also add some explicit coverage of the two formats to
make sure we continue to handle the old one after we move the writing
side over.

And one final note: since we're now using the shell-quoting as a
semantically meaningful hint, this closes the door to us ever allowing
arbitrary shell quoting, like:

  'a'shell'would'be'ok'with'this'.key=value

But we have never supported that (only what sq_quote() would produce),
and we are probably better off keeping things simple, robust, and
backwards-compatible, than trying to make it easier for humans. We'll
continue not to advertise the format of the variable to users, and
instead keep "git -c" as the recommended mechanism for setting config
(even if we are trying to be kind not to break users who may be relying
on the current undocumented format).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-15 13:03:18 -08:00
2d02bc91c0 t4203: make blame output massaging more robust
In the "git blame --porcelain" output, lines that ends with three
integers may not be the line that shows a commit object with line
numbers and block length (the contents from the blamed file or the
summary field can have a line that happens to match).  Also, the
names of the author may have more than three SP separated tokens
("git blame -L242,+1 cf6de18aab Documentation/SubmittingPatches"
gives an example).  The existing "grep -E | cut" pipeline is a bit
too loose on these two points.

While they can be assumed on the test data, it is not so hard to
use the right pattern from the documented format, so let's do so.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-14 21:54:52 -08:00
97f4b4c4e7 mailmap doc: use correct environment variable 'GIT_WORK_TREE'
gitmailmap(5) uses 'GIT_WORK_DIR' to refer to the root of the
repository, but this environment variable does not exist.

Use the correct spelling for that variable, 'GIT_WORK_TREE'.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-14 21:54:06 -08:00
779412b9d9 for_each_object_in_pack(): clarify pack vs index ordering
We may return objects in one of two orders: how they appear in the .idx
(sorted by object id) or how they appear in the packfile itself. To
further complicate matters, we have two ordering variables, "i" and
"pos", and it is not clear to which order they apply.

Let's clarify this by using an unambiguous name where possible, and
leaving a comment for the variable that does double-duty.

Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-14 18:22:27 -08:00
afa80f534b t4203: stop losing return codes of git commands
In a pipe, only the return code of the last command is used. Thus, all
other commands will have their return codes masked. Rewrite pipes so
that there are no git commands upstream so that their failure is
reported.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Acked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-14 18:21:21 -08:00
f9f30a0310 test-lib-functions.sh: fix usage for test_commit()
The usage comment for test_commit() shows that the --author option
should be given as `--author=<author>`. However, this is incorrect as it
only works when given as `--author <author>`. Correct this erroneous
text.

Also, for the sake of correctness, fix the description as well since we
invoke `git commit` with `--author <author>`, not `--author=<author>`.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Acked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-14 18:21:03 -08:00
7c99bc23fc pack-write: die on error in write_promisor_file()
write_promisor_file() already uses xfopen(), so it would die
if the file cannot be opened for writing. To be consistent
with this behavior and not overlook issues, let's also die if
there are errors when we are actually writing to the file.

Suggested-by: Jeff King <peff@peff.net>
Suggested-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-14 17:02:22 -08:00
12aa5552a9 Merge branch 'en/ort-conflict-handling' into en/merge-ort-perf
* en/ort-conflict-handling:
  merge-ort: add handling for different types of files at same path
  merge-ort: copy find_first_merges() implementation from merge-recursive.c
  merge-ort: implement format_commit()
  merge-ort: copy and adapt merge_submodule() from merge-recursive.c
  merge-ort: copy and adapt merge_3way() from merge-recursive.c
  merge-ort: flesh out implementation of handle_content_merge()
  merge-ort: handle book-keeping around two- and three-way content merge
  merge-ort: implement unique_path() helper
  merge-ort: handle directory/file conflicts that remain
  merge-ort: handle D/F conflict where directory disappears due to merge
2021-01-14 12:41:54 -08:00
cafc587a1d Merge branch 'en/diffcore-rename' into en/merge-ort-perf
* en/diffcore-rename:
  diffcore-rename: remove unnecessary duplicate entry checks
  diffcore-rename: accelerate rename_dst setup
  diffcore-rename: simplify and accelerate register_rename_src()
  t4058: explore duplicate tree entry handling in a bit more detail
  t4058: add more tests and documentation for duplicate tree entry handling
  diffcore-rename: reduce jumpiness in progress counters
  diffcore-rename: simplify limit check
  diffcore-rename: avoid usage of global in too_many_rename_candidates()
  diffcore-rename: rename num_create to num_destinations
2021-01-14 12:41:45 -08:00
e5dcd78418 pack-revindex.c: avoid direct revindex access in 'offset_to_pack_pos()'
To prepare for on-disk reverse indexes, remove a spot in
'offset_to_pack_pos()' that looks at the 'revindex' array in 'struct
packed_git'.

Even though this use of the revindex pointer is within pack-revindex.c,
this clean up is still worth doing. Since the 'revindex' pointer will be
NULL when reading from an on-disk reverse index (instead the
'revindex_data' pointer will be mmaped to the 'pack-*.rev' file), this
call-site would have to include a conditional to lookup the offset for
position 'mi' each iteration through the search.

So instead of open-coding 'pack_pos_to_offset()', call it directly from
within 'offset_to_pack_pos()'.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-13 21:53:48 -08:00
d5bc7c60c7 pack-revindex: hide the definition of 'revindex_entry'
Now that all spots outside of pack-revindex.c that reference 'struct
revindex_entry' directly have been removed, it is safe to hide the
implementation by moving it from pack-revindex.h to pack-revindex.c.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-13 21:53:48 -08:00
8389855a9b pack-revindex: remove unused 'find_revindex_position()'
Now that all 'find_revindex_position()' callers have been removed (and
converted to the more descriptive 'offset_to_pack_pos()'), it is almost
safe to get rid of 'find_revindex_position()' entirely. Almost, except
for the fact that 'offset_to_pack_pos()' calls
'find_revindex_position()'.

Inline 'find_revindex_position()' into 'offset_to_pack_pos()', and
then remove 'find_revindex_position()' entirely.

This is a straightforward refactoring with one minor snag.
'offset_to_pack_pos()' used to load the index before calling
'find_revindex_position()'. That means that by the time
'find_revindex_position()' starts executing, 'p->num_objects' can be
safely read. After inlining, be careful to not read 'p->num_objects'
until _after_ 'load_pack_revindex()' (which loads the index as a
side-effect) has been called.

Another small fix that is included is converting the upper- and
lower-bounds to be unsigned's instead of ints. This dates back to
92e5c77c37 (revindex: export new APIs, 2013-10-24)--ironically, the last
time we introduced new APIs here--but this unifies the types.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-13 21:53:48 -08:00
1c3855f33b pack-revindex: remove unused 'find_pack_revindex()'
Now that no callers of 'find_pack_revindex()' remain, remove the
function's declaration and implementation entirely.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-13 21:53:47 -08:00
2891b434ac builtin/gc.c: guess the size of the revindex
'estimate_repack_memory()' takes into account the amount of memory
required to load the reverse index in memory by multiplying the assumed
number of objects by the size of the 'revindex_entry' struct.

Prepare for hiding the definition of 'struct revindex_entry' by removing
a 'sizeof()' of that type from outside of pack-revindex.c. Instead,
guess that one off_t and one uint32_t are required per object. Strictly
speaking, this is a worse guess than asking for 'sizeof(struct
revindex_entry)' directly, since the true size of this struct is 16
bytes with padding on the end of the struct in order to align the offset
field.

But, this is an approximation anyway, and it does remove a use of the
'struct revindex_entry' from outside of pack-revindex internals.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-13 21:53:47 -08:00
b130aef65e for_each_object_in_pack(): convert to new revindex API
Avoid looking at the 'revindex' pointer directly and instead call
'pack_pos_to_index()'.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-13 21:53:47 -08:00
0a7e3642bc unpack_entry(): convert to new revindex API
Remove direct manipulation of the 'struct revindex_entry' type as well
as calls to the deprecated API in 'packfile.c:unpack_entry()'. Usual
clean-up is performed (replacing '->nr' with calls to
'pack_pos_to_index()' and so on).

Add an additional check to make sure that 'obj_offset()' points at a
valid object. In the case this check is violated, we cannot call
'mark_bad_packed_object()' because we don't know the OID. At the top of
the call stack is do_oid_object_info_extended() (via
packed_object_info()), which does mark the object.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-13 21:53:47 -08:00
fc150caf67 packed_object_info(): convert to new revindex API
Convert another call of 'find_pack_revindex()' to its replacement
'pack_pos_to_offset()'. Likewise:

  - Avoid manipulating `struct packed_git`'s `revindex` pointer directly
    by removing the pointer-as-array indexing.

  - Add an additional guard to check that the offset 'obj_offset()'
    points to a real object. This should be the case with well-behaved
    callers to 'packed_object_info()', but isn't guarenteed.

    Other blocks that fill in various other values from the 'struct
    object_info' request handle bad inputs by setting the type to
    'OBJ_BAD' and jumping to 'out'. Do the same when given a bad offset
    here.

    The previous code would have segfaulted when given a bad
    'obj_offset' value, since 'find_pack_revindex()' would return
    'NULL', and then the line that fills 'oi->disk_sizep' would try to
    access 'NULL[1]' with a stride of 16 bytes (the width of 'struct
    revindex_entry)'.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-13 21:53:47 -08:00
3a3f54dd0a retry_bad_packed_offset(): convert to new revindex API
Perform exactly the same conversion as in the previous commit to another
caller within 'packfile.c'.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-13 21:53:47 -08:00
45bef5c064 get_delta_base_oid(): convert to new revindex API
Replace direct accesses to the 'struct revindex' type with a call to
'pack_pos_to_index()'.

Likewise drop the old-style 'find_pack_revindex()' with its replacement
'offset_to_pack_pos()' (while continuing to perform the same error
checking).

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-13 21:53:46 -08:00
78232bf65d rebuild_existing_bitmaps(): convert to new revindex API
Remove another instance of looking at the revindex directly by instead
calling 'pack_pos_to_index()'. Unlike other patches, this caller only
cares about the index position of each object in the loop.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-13 21:53:46 -08:00
011f3fd5cd try_partial_reuse(): convert to new revindex API
Remove another instance of direct revindex manipulation by calling
'pack_pos_to_offset()' instead (the caller here does not care about the
index position of the object at position 'pos').

Note that we cannot just use the existing "offset" variable to store the
value we get from pack_pos_to_offset(). It is incremented by
unpack_object_header(), but we later need the original value. Since
we'll no longer have revindex->offset to read it from, we'll store that
in a separate variable ("header" since it points to the entry's header
bytes).

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-13 21:53:46 -08:00
a78a90324d get_size_by_pos(): convert to new revindex API
Remove another caller that holds onto a 'struct revindex_entry' by
replacing the direct indexing with calls to 'pack_pos_to_offset()' and
'pack_pos_to_index()'.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-13 21:53:46 -08:00
cf98f2e8e0 show_objects_for_type(): convert to new revindex API
Avoid storing the revindex entry directly, since this structure will
soon be removed from the public interface. Instead, store the offset and
index position by calling 'pack_pos_to_offset()' and
'pack_pos_to_index()', respectively.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-13 21:53:46 -08:00
57665086af bitmap_position_packfile(): convert to new revindex API
Replace find_revindex_position() with its counterpart in the new API,
offset_to_pack_pos().

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-13 21:53:45 -08:00
eb3fd99efd check_object(): convert to new revindex API
Replace direct accesses to the revindex with calls to
'offset_to_pack_pos()' and 'pack_pos_to_index()'.

Since this caller already had some error checking (it can jump to the
'give_up' label if it encounters an error), we can easily check whether
or not the provided offset points to an object in the given pack. This
error checking existed prior to this patch, too, since the caller checks
whether the return value from 'find_pack_revindex()' was NULL or not.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-13 21:53:45 -08:00
6a5c10c45f write_reused_pack_verbatim(): convert to new revindex API
Replace a direct access to the revindex array with
'pack_pos_to_offset()'.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-13 21:53:45 -08:00
66cbd3e2fb write_reused_pack_one(): convert to new revindex API
Replace direct revindex accesses with calls to 'pack_pos_to_offset()'
and 'pack_pos_to_index()'.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-13 21:53:45 -08:00
952fc6870d write_reuse_object(): convert to new revindex API
First replace 'find_pack_revindex()' with its replacement
'offset_to_pack_pos()'. This prevents any bogus OFS_DELTA that may make
its way through until 'write_reuse_object()' from causing a bad memory
read (if 'revidx' is 'NULL')

Next, replace a direct access of '->nr' with the wrapper function
'pack_pos_to_index()'.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-13 21:53:45 -08:00
f33fb6e419 pack-revindex: introduce a new API
In the next several patches, we will prepare for loading a reverse index
either in memory (mapping the inverse of the .idx's contents in-core),
or directly from a yet-to-be-introduced on-disk format. To prepare for
that, we'll introduce an API that avoids the caller explicitly indexing
the revindex pointer in the packed_git structure.

There are four ways to interact with the reverse index. Accordingly,
four functions will be exported from 'pack-revindex.h' by the time that
the existing API is removed. A caller may:

 1. Load the pack's reverse index. This involves opening up the index,
    generating an array, and then sorting it. Since opening the index
    can fail, this function ('load_pack_revindex()') returns an int.
    Accordingly, it takes only a single argument: the 'struct
    packed_git' the caller wants to build a reverse index for.

    This function is well-suited for both the current and new API.
    Callers will have to continue to open the reverse index explicitly,
    but this function will eventually learn how to detect and load a
    reverse index from the on-disk format, if one exists. Otherwise, it
    will fallback to generating one in memory from scratch.

 2. Convert a pack position into an offset. This operation is now
    called `pack_pos_to_offset()`. It takes a pack and a position, and
    returns the corresponding off_t.

    Any error simply calls BUG(), since the callers are not well-suited
    to handle a failure and keep going.

 3. Convert a pack position into an index position. Same as above; this
    takes a pack and a position, and returns a uint32_t. This operation
    is known as `pack_pos_to_index()`. The same thinking about error
    conditions applies here as well.

 4. Find the pack position for a given offset. This operation is now
    known as `offset_to_pack_pos()`. It takes a pack, an offset, and a
    pointer to a uint32_t where the position is written, if an object
    exists at that offset. Otherwise, -1 is returned to indicate
    failure.

    Unlike some of the callers that used to access '->offset' and '->nr'
    directly, the error checking around this call is somewhat more
    robust. This is important since callers should always pass an offset
    which points at the boundary of two objects. The API, unlike direct
    access, enforces that that is the case.

    This will become important in a subsequent patch where a caller
    which does not but could check the return value treats the signed
    `-1` from `find_revindex_position()` as an index into the 'revindex'
    array.

Two design warts are carried over into the new API:

  - Asking for the index position of an out-of-bounds object will result
    in a BUG() (since no such object exists), but asking for the offset
    of the non-existent object at the end of the pack returns the total
    size of the pack.

    This makes it convenient for callers who always want to take the
    difference of two adjacent object's offsets (to compute the on-disk
    size) but don't want to worry about boundaries at the end of the
    pack.

  - offset_to_pack_pos() lazily loads the reverse index, but
    pack_pos_to_index() doesn't (callers of the former are well-suited
    to handle errors, but callers of the latter are not).

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-13 21:53:44 -08:00
0d28d3cf33 CoC: update to version 2.0 + local changes
Update the CoC added in 5cdf2301 (add a Code of Conduct document,
2019-09-24 from version 1.4 to version 2.0. This is the version found
at [1] with the following minor changes:

 - We preserve the change to the CoC in 3f9ef874a7 (CODE_OF_CONDUCT:
   mention individual project-leader emails, 2019-09-26)

 - We preserve the custom intro added in 5cdf2301d4 (add a Code of
   Conduct document, 2019-09-24)

This change intentionally preserves a warning emitted on "git diff
--check". It's better to make it easily diff-able with upstream than
to fix whitespace changes in our version while we're at it.

1. https://www.contributor-covenant.org/version/2/0/code_of_conduct/code_of_conduct.md

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Christian Couder <chriscool@tuxfamily.org>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Acked-by: Elijah Newren <newren@gmail.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Acked-by: Jonathan Nieder <jrnieder@gmail.com>
Acked-by: brian m. carlson <sandals@crustytoothpaste.net>
Acked-by: Jeff King <peff@peff.net>
Acked-by: Taylor Blau <me@ttaylor.com>
Acked-by: Jonathan Tan <jonathantanmy@google.com>
Acked-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-13 17:45:04 -08:00
33add2ad7d fetch-pack: refactor writing promisor file
Let's replace the 2 different pieces of code that write a
promisor file in 'builtin/repack.c' and 'fetch-pack.c'
with a new function called 'write_promisor_file()' in
'pack-write.c' and 'pack.h'.

This might also help us in the future, if we want to put
back the ref names and associated hashes that were in
the promisor files we are repacking in 'builtin/repack.c'
as suggested by a NEEDSWORK comment just above the code
we are refactoring.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 16:01:07 -08:00
9d7fa3be31 fetch-pack: rename helper to create_promisor_file()
As we are going to refactor the code that actually writes
the promisor file into a separate function in a following
commit, let's rename the current write_promisor_file()
function to create_promisor_file().

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 16:01:07 -08:00
4e168333a8 shortlog: remove unused(?) "repo-abbrev" feature
Remove support for the magical "repo-abbrev" comment in .mailmap
files. This was added to .mailmap parsing in [1], as a generalized
feature of the git-shortlog Perl script added earlier in [2].

There was no documentation or tests for this feature, and I don't
think it's used in practice anymore.

What it did was to allow you to specify a single string to be
search-replaced with "/.../" in the .mailmap file. E.g. for
linux.git's current .mailmap:

    git archive --remote=git@gitlab.com:linux-kernel/linux.git \
        HEAD -- .mailmap | grep -a repo-abbrev
    # repo-abbrev: /pub/scm/linux/kernel/git/

Then when running e.g.:

    git shortlog --merges --author=Linus -1 v5.10-rc7..v5.10 | grep Merge

We'd emit (the [...] is mine):

      Merge tag [...]git://git.kernel.org/.../tip/tip

But will now emit:

      Merge tag [...]git.kernel.org/pub/scm/linux/kernel/git/tip/tip

I think at this point this is just a historical artifact we can get
rid of. It was initially meant for Linus's own use when we integrated
the Perl script[2], but since then it seems he's stopped using it.

Digging through Linus's release announcements on the LKML[3] the last
release I can find that made use of this output is Linux 2.6.25-rc6
back in March 2008[4]. Later on Linus started using --no-merges[5],
and nowadays seems to prefer some custom not-quite-shortlog format of
merges from lieutenants[6].

You will still see it on linux.git if you run "git shortlog" manually
yourself with --merges, with this removed you can still get the same
output with:

    git log --pretty=fuller v5.10-rc7..v5.10 |
    sed 's!/pub/scm/linux/kernel/git/!/.../!g' |
    git shortlog

Arguably we should do the same for the search-replacing of "[PATCH]"
at the beginning with "". That seems to be another relic of a bygone
era when linux.git patches would have their E-Mail subject lines
applied as-is by "git am" or whatever. But we documented that feature
in "git-shortlog(1)", and it seems more widely applicable than
something purely kernel-specific.

1. 7595e2ee6e (git-shortlog: make common repository prefix
   configurable with .mailmap, 2006-11-25)
2. fa375c7f1b (Add git-shortlog perl script, 2005-06-04)
3. https://lore.kernel.org/lkml/
4. https://lore.kernel.org/lkml/alpine.LFD.1.00.0803161651350.3020@woody.linux-foundation.org/
5. https://lore.kernel.org/lkml/BANLkTinrbh7Xi27an3uY7pDWrNKhJRYmEA@mail.gmail.com/
6. https://lore.kernel.org/lkml/CAHk-=wg1+kf1AVzXA-RQX0zjM6t9J2Kay9xyuNqcFHWV-y5ZYw@mail.gmail.com/

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 14:04:42 -08:00
238803cb40 mailmap doc + tests: document and test for case-insensitivity
Add documentation and more tests for case-insensitivity. The existing
test only matched on the E-Mail part, but as shown here we also match
the name with strcasecmp().

This behavior was last discussed on the mailing list in the thread
starting at [1]. It seems we're keeping it like this, so let's
document it.

1. https://lore.kernel.org/git/87czykvg19.fsf@evledraar.gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 14:04:42 -08:00
34986b773a mailmap tests: add tests for empty "<>" syntax
Add tests for mailmap's handling of "<>", which is allowed on the RHS,
but not the LHS of a "<LHS> <RHS>" pair.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 14:04:42 -08:00
9e2a14a889 mailmap tests: add tests for whitespace syntax
Add tests for mailmap's handling of whitespace, i.e. how it trims
space within "<>" and around author names.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 14:04:42 -08:00
9b391b09a0 mailmap tests: add a test for comment syntax
Add a test for mailmap comment syntax. As noted in [1] there was no
test coverage for this. Let's make sure a future change doesn't break
it.

1. https://lore.kernel.org/git/CAN0heSoKYWXqskCR=GPreSHc6twCSo1345WTmiPdrR57XSShhA@mail.gmail.com/

Reported-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 14:04:42 -08:00
05b5ff219c mailmap doc + tests: add better examples & test them
Change the mailmap documentation added in 0925ce4d49 (Add map_user()
and clear_mailmap() to mailmap, 2009-02-08) to continue discussing the
Jane/Joe example. I think this makes things a lot less confusing as
we're building up more complex examples using one set of data which
covers all the things we'd like to discuss.

Also add tests to assert that what our documentation says is what's
actually happening. This is mostly (or entirely) covered by existing
tests which I'm not deleting, but having these tests for the synopsis
makes it easier to follow-along while reading the tests & docs.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 14:04:42 -08:00
f5d79bf7dd tests: refactor a few tests to use "test_commit --append"
Refactor a few more tests to use the new "--append" option to
"test_commit". I added it for use in the mailmap tests, but this
demonstrates how useful it is in general.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 14:04:41 -08:00
3373518cc8 test-lib functions: add an --append option to test_commit
Add an --append option to test_commit to append <contents> to the
<file> we're writing to. This simplifies a lot of test setup, as shown
in some of the tests being changed here.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 14:04:41 -08:00
999cfc4f45 test-lib functions: add --author support to test_commit
Add support for --author to "test_commit". This will simplify some
current and future tests, one of those is being changed here.

Let's also line-wrap the "git commit" command invocation to make diffs
that add subsequent options easier to add, as they'll only need to add
a new option line.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 14:04:41 -08:00
76b8b8d05c test-lib functions: document arguments to test_commit
The --notick argument was added in [1] and was followed by --signoff
in [2], but neither of these commits added any documentation for these
options. When -C was added in [3] a comment was added to document it,
but not the other options. Let's document all of these options.

1. 44b85e89d7 (t7003: add test to filter a branch with a commit at
   epoch, 2012-07-12),
2. 5ed75e2a3f (cherry-pick: don't forget -s on failure, 2012-09-14).
3. 6f94351b0a (test-lib-functions.sh: teach test_commit -C <dir>,
   2016-12-08)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 14:04:41 -08:00
f21426e189 test-lib functions: expand "test_commit" comment template
Expand the comment template for "test_commit" to match that of
"test_commit_bulk" added in b1c36cb849 (test-lib: introduce
test_commit_bulk, 2019-07-02). It has several undocumented options,
which won't all fit on one line. Follow-up commit(s) will document
them.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 14:04:41 -08:00
56ac194e1d mailmap: test for silent exiting on missing file/blob
That we silently ignore missing mailmap.file or mailmap.blob values is
intentional. See 938a60d64f (mailmap: clean up read_mailmap error
handling, 2012-12-12). However, nothing tested for this. Let's do that
by checking that stderr is empty in those cases.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 14:04:41 -08:00
c1fe7fd7e3 mailmap tests: get rid of overly complex blame fuzzing
Change a test that used a custom fuzzing function since
bfdfa3d414 (t4203 (mailmap): stop hardcoding commit ids and dates,
2010-10-15) to just use the "blame --porcelain" output instead.

We could use the same pattern as 0ba9c9a0fb (t8008: rely on
rev-parse'd HEAD instead of sha1 value, 2017-07-26) does to do this,
but there wouldn't be any point. We're not trying to test "blame"
output here in general, just that "blame" pays attention to the
mailmap.

So it's sufficient to get the blamed line(s) and authors from the
output, which is much easier with the "--porcelain" option.

It would still be possible for there to be a bug in "blame" such that
it uses the mailmap for its "--porcelain" output, but not the regular
output. Let's test for that simply by checking if specifying the
mailmap changes the output.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 14:04:41 -08:00
400d160e39 mailmap tests: add a test for "not a blob" error
Add a test for one of the error conditions added in
938a60d64f (mailmap: clean up read_mailmap error handling,
2012-12-12).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 14:04:40 -08:00
fb3bbe4ea3 mailmap tests: remove redundant entry in test
Remove a redundant line in a test added in d20d654fe8 (Change current
mailmap usage to do matching on both name and email of
author/committer., 2009-02-08).

This didn't conceivably test anything useful and is most likely a
copy/paste error.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 14:04:40 -08:00
1db421ab85 mailmap tests: improve --stdin tests
The --stdin tests setup the "contact" file in the main setup, let's
instead set it up in the test that uses it.

Also refactor the first test so it's obvious that the point of it is
that "check-mailmap" will spew its input as-is when given no
argument. For that one we can just use the "expect" file as-is.

Also add tests for how other "--stdin" cases are handled, e.g. one
where we actually do a mapping.

For the rest of --stdin testing we just assume we're going to get the
same output. We could follow-up and make sure everything's
round-tripped through both --stdin and the file/blob backends, but I
don't think there's much point in that.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 14:04:40 -08:00
e9931ace4f mailmap tests: modernize syntax & test idioms
Refactor the mailmap tests to:

 * Setup "actual" test files in the body of "test_expect_success"

 * Don't have X of "test_expect_success X Y" be an unquoted string.

 * Not to carry over test config between tests, and instead use
   "test_config".

 * Replace various "echo" a line-at-a-time patterns with here-docs.

 * Change a case of "log.mailmap=False" to use the lower-case
   "false". Both work, but this ends up in git-config's boolean
   parsing and these atypical values are tested for elsewhere. Let's
   use the lower-case to not draw the reader's attention to this
   abnormality.

 * Remove commentary asserting that things work a given way in favor
   of simply testing for it, i.e. in the case of a .mailmap file
   outside of the repository.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 14:04:40 -08:00
9aaeac9cf7 mailmap tests: use our preferred whitespace syntax
Change these tests to use the preferred whitespace around ">",
"<<-EOF" etc. This is an initial step in larger and more meaningful
refactoring of the file, which makes a subsequent commit easier to
read.

I'm not changing the whitespace of "echo <str> > file" patterns to
"echo <str> >file" because all of those will be changed to here-docs
in a subsequent commit.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 14:04:40 -08:00
fcafb75382 mailmap doc: start by mentioning the comment syntax
Mentioning the comment syntax and blank line support first is in line
with how "git help config" describes its format. See
b8936cf060 (config.txt grammar, typo, and asciidoc fixes, 2006-06-08)
for the paragraph I'm copying & amending from its documentation.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 14:04:40 -08:00
6646cca892 check-mailmap doc: note config options
Add a passing mention of the mailmap.file and mailmap.blob
configuration options. Before this addition a reader of the
"check-mailmap" manpage would have no idea that a custom map could be
specified, unless they'd happen to e.g. come across it in the "config"
manpage first.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 14:04:40 -08:00
4f2ee994f3 mailmap doc: quote config variables like.this
Quote the mailmap.file and mailmap.blob configuration variables as
`mailmap.file` and `mailmap.blob`, and link to git-config(1). This is
in line with the preferred way of doing this in the rest of our
documentation.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 14:04:40 -08:00
42957af027 mailmap doc: create a new "gitmailmap(5)" man page
Create a gitmailmap(5) page similar to how .gitmodules and .gitignore
have their own pages at gitmodules(5) and gitignore(5). Now instead of
"check-mailmap", "blame" and "shortlog" documentation including the
description of the format we link to one canonical place.

This makes things easier for readers, since in our manpage or
web-based[1] output it's not clear that the "MAPPING AUTHORS" sections
aren't subtly different, as opposed to just included.

1. https://git-scm.com/docs/git-check-mailmap

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 14:04:39 -08:00
c7b190dabd fetch: implement support for atomic reference updates
When executing a fetch, then git will currently allocate one reference
transaction per reference update and directly commit it. This means that
fetches are non-atomic: even if some of the reference updates fail,
others may still succeed and modify local references.

This is fine in many scenarios, but this strategy has its downsides.

- The view of remote references may be inconsistent and may show a
  bastardized state of the remote repository.

- Batching together updates may improve performance in certain
  scenarios. While the impact probably isn't as pronounced with loose
  references, the upcoming reftable backend may benefit as it needs to
  write less files in case the update is batched.

- The reference-update hook is currently being executed twice per
  updated reference. While this doesn't matter when there is no such
  hook, we have seen severe performance regressions when doing a
  git-fetch(1) with reference-transaction hook when the remote
  repository has hundreds of thousands of references.

Similar to `git push --atomic`, this commit thus introduces atomic
fetches. Instead of allocating one reference transaction per updated
reference, it causes us to only allocate a single transaction and commit
it as soon as all updates were received. If locking of any reference
fails, then we abort the complete transaction and don't update any
reference, which gives us an all-or-nothing fetch.

Note that this may not completely fix the first of above downsides, as
the consistent view also depends on the server-side. If the server
doesn't have a consistent view of its own references during the
reference negotiation phase, then the client would get the same
inconsistent view the server has. This is a separate problem though and,
if it actually exists, can be fixed at a later point.

This commit also changes the way we write FETCH_HEAD in case `--atomic`
is passed. Instead of writing changes as we go, we need to accumulate
all changes first and only commit them at the end when we know that all
reference updates succeeded. Ideally, we'd just do so via a temporary
file so that we don't need to carry all updates in-memory. This isn't
trivially doable though considering the `--append` mode, where we do not
truncate the file but simply append to it. And given that we support
concurrent processes appending to FETCH_HEAD at the same time without
any loss of data, seeding the temporary file with current contents of
FETCH_HEAD initially and then doing a rename wouldn't work either. So
this commit implements the simple strategy of buffering all changes and
appending them to the file on commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 12:06:15 -08:00
d4c8db8f1b fetch: allow passing a transaction to s_update_ref()
The handling of ref updates is completely handled by `s_update_ref()`,
which will manage the complete lifecycle of the reference transaction.
This is fine right now given that git-fetch(1) does not support atomic
fetches, so each reference gets its own transaction. It is quite
inflexible though, as `s_update_ref()` only knows about a single
reference update at a time, so it doesn't allow us to alter the
strategy.

This commit prepares `s_update_ref()` and its only caller
`update_local_ref()` to allow passing an external transaction. If none
is given, then the existing behaviour is triggered which creates a new
transaction and directly commits it. Otherwise, if the caller provides a
transaction, then we only queue the update but don't commit it. This
optionally allows the caller to manage when a transaction will be
committed.

Given that `update_local_ref()` is always called with a `NULL`
transaction for now, no change in behaviour is expected from this
change.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 12:06:15 -08:00
c45889f104 fetch: refactor s_update_ref to use common exit path
The cleanup code in `s_update_ref()` is currently duplicated for both
succesful and erroneous exit paths. This commit refactors the function
to have a shared exit path for both cases to remove the duplication.

Suggested-by: Christian Couder <christian.couder@gmail.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 12:06:15 -08:00
929d044575 fetch: use strbuf to format FETCH_HEAD updates
This commit refactors `append_fetch_head()` to use a `struct strbuf` for
formatting the update which we're about to append to the FETCH_HEAD
file. While the refactoring doesn't have much of a benefit right now, it
serves as a preparatory step to implement atomic fetches where we need
to buffer all updates to FETCH_HEAD and only flush them out if all
reference updates succeeded.

No change in behaviour is expected from this commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 12:06:14 -08:00
58a646a368 fetch: extract writing to FETCH_HEAD
When performing a fetch with the default `--write-fetch-head` option, we
write all updated references to FETCH_HEAD while the updates are
performed. Given that updates are not performed atomically, it means
that we we write to FETCH_HEAD even if some or all of the reference
updates fail.

Given that we simply update FETCH_HEAD ad-hoc with each reference, the
logic is completely contained in `store_update_refs` and thus quite hard
to extend. This can already be seen by the way we skip writing to the
FETCH_HEAD: instead of having a conditional which simply skips writing,
we instead open "/dev/null" and needlessly write all updates there.

We are about to extend git-fetch(1) to accept an `--atomic` flag which
will make the fetch an all-or-nothing operation with regards to the
reference updates. This will also require us to make the updates to
FETCH_HEAD an all-or-nothing operation, but as explained doing so is not
easy with the current layout. This commit thus refactors the wa we write
to FETCH_HEAD and pulls out the logic to open, append to, commit and
close the file. While this may seem rather over-the top at first,
pulling out this logic will make it a lot easier to update the code in a
subsequent commit. It also allows us to easily skip writing completely
in case `--no-write-fetch-head` was passed.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 12:06:14 -08:00
b342ae61b3 config: extract function to parse config pairs
The function `git_config_parse_parameter` is responsible for parsing a
`foo.bar=baz`-formatted configuration key, sanitizing the key and then
processing it via the given callback function. Given that we're about to
add a second user which is going to process keys which already has keys
and values separated, this commit extracts a function
`config_parse_pair` which only does the sanitization and processing
part as a preparatory step.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 12:03:18 -08:00
13c44953fb quote: make sq_dequote_step() a public function
We provide a function for dequoting an entire string, as well as one for
handling a space-separated list of quoted strings. But there's no way
for a caller to parse a string like 'foo'='bar', even though it is easy
to generate one using sq_quote_buf() or similar.

Let's make the single-step function available to callers outside of
quote.c. Note that we do need to adjust its implementation slightly: it
insists on seeing whitespace between items, and we'd like to be more
flexible than that. Since it only has a single caller, we can move that
check (and slurping up any extra whitespace) into that caller.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 12:03:18 -08:00
ce81b1da23 config: add new way to pass config via --config-env
While it's already possible to pass runtime configuration via `git -c
<key>=<value>`, it may be undesirable to use when the value contains
sensitive information. E.g. if one wants to set `http.extraHeader` to
contain an authentication token, doing so via `-c` would trivially leak
those credentials via e.g. ps(1), which typically also shows command
arguments.

To enable this usecase without leaking credentials, this commit
introduces a new switch `--config-env=<key>=<envvar>`. Instead of
directly passing a value for the given key, it instead allows the user
to specify the name of an environment variable. The value of that
variable will then be used as value of the key.

Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 12:03:18 -08:00
5bb0fd2cab bundle: arguments can be read from stdin
In order to create an incremental bundle, we need to pass many arguments
to let git-bundle ignore some already packed commits.  It will be more
convenient to pass args via stdin.  But the current implementation does
not allow us to do this.

This is because args are parsed twice when creating bundle.  The first
time for parsing args is in `compute_and_write_prerequisites()` by
running `git-rev-list` command to write prerequisites in bundle file,
and stdin is consumed in this step if "--stdin" option is provided for
`git-bundle`.  Later nothing can be read from stdin when running
`setup_revisions()` in `create_bundle()`.

The solution is to parse args once by removing the entire function
`compute_and_write_prerequisites()` and then calling function
`setup_revisions()`.  In order to write prerequisites for bundle, will
call `prepare_revision_walk()` and `traverse_commit_list()`.  But after
calling `prepare_revision_walk()`, the object array `revs.pending` is
left empty, and the following steps could not work properly with the
empty object array (`revs.pending`).  Therefore, make a copy of `revs`
to `revs_copy` for later use right after calling `setup_revisions()`.

The copy of `revs_copy` is not a deep copy, it shares the same objects
with `revs`. The object array of `revs` has been cleared, but objects
themselves are still kept.  Flags of objects may change after calling
`prepare_revision_walk()`, we can use these changed flags without
calling the `git rev-list` command and parsing its output like the
former implementation.

Also add testcases for git bundle in t6020, which read args from stdin.

Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-11 21:50:41 -08:00
ce1d6d9f16 bundle: lost objects when removing duplicate pendings
`git rev-list` will list one commit for the following command:

    $ git rev-list 'main^!'
    <tip-commit-of-main-branch>

But providing the same rev-list args to `git bundle`, fail to create
a bundle file.

    $ git bundle create - 'main^!'
    # v2 git bundle
    -<OID> <one-line-message>

    fatal: Refusing to create empty bundle.

This is because when removing duplicate objects in function
`object_array_remove_duplicates()`, one unique pending object which has
the same name is deleted by mistake.  The revision arg 'main^!' in the
above example is parsed by `handle_revision_arg()`, and at lease two
different objects will be appended to `revs.pending`, one points to the
parent commit of the "main" branch, and the other points to the tip
commit of the "main" branch.  These two objects have the same name
"main".  Only one object is left with the name "main" after calling the
function `object_array_remove_duplicates()`.

And what's worse, when adding boundary commits into pending list, we use
one-line commit message as names, and the arbitory names may surprise
git-bundle.

Only comparing objects themselves (".item") is also not good enough,
because user may want to create a bundle with two identical objects but
with different reference names, such as: "HEAD" and "refs/heads/main".

Add new function `contains_object()` which compare both the address and
the name of the object.

Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-11 21:50:41 -08:00
9901164d81 test: add helper functions for git-bundle
Move git-bundle related functions from t5510 to a library, and this lib
will be shared with a new testcase t6020 which finds a known breakage of
"git-bundle".

Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-11 21:50:41 -08:00
6436a20284 refs: allow @{n} to work with n-sized reflog
This sequence works

	$ git checkout -b newbranch
	$ git commit --allow-empty -m one
	$ git show -s newbranch@{1}

and shows the state that was immediately after the newbranch was
created.

But then if you do

	$ git reflog expire --expire=now refs/heads/newbranch
	$ git commit --allow=empty -m two
	$ git show -s newbranch@{1}

you'd be scolded with

	fatal: log for 'newbranch' only has 1 entries

While it is true that it has only 1 entry, we have enough
information in that single entry that records the transition between
the state in which the tip of the branch was pointing at commit
'one' to the new commit 'two' built on it, so we should be able to
answer "what object newbranch was pointing at?". But we refuse to
do so.

Make @{0} the special case where we use the new side to look up that
entry. Otherwise, look up @{n} using the old side of the (n-1)th entry
of the reflog.

Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-11 14:13:50 -08:00
95c2a71820 refs: factor out set_read_ref_cutoffs()
This block of code is duplicated twice. In a future commit, it will be
duplicated for a third time. Factor out the common functionality into
set_read_ref_cutoffs().

In the case of read_ref_at_ent(), we are incrementing `cb->reccnt` at the
beginning of the function. Move these to right before the return so that
the `cb->reccnt - 1` is changed to `cb->reccnt` and it can be cleanly
factored out into set_read_ref_cutoffs(). The duplication of the
increment statements will be removed in a future patch.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-10 12:24:00 -08:00
e3f5da7e60 t7800-difftool: don't accidentally match tmp dirs
In a bunch of test cases in 't7800-difftool.sh' we 'grep' for specific
filenames in 'git difftool's output, and those test cases are prone to
occasional failures because those filenames might be part of the name
of difftool's temporary directory as well, e.g.:

  +git difftool --dir-diff --no-symlinks --extcmd ls v1
  +grep sub output
  +test_line_count = 2 sub-output
  test_line_count: line count for sub-output != 2
  /tmp/git-difftool.Ssubfq/left/:
  sub
  /tmp/git-difftool.Ssubfq/right/:
  sub
  error: last command exited with $?=1
  not ok 50 - difftool --dir-diff v1 from subdirectory --no-symlinks

Fix this by tightening the 'grep' patterns looking for those
interesting filenames to match only lines where a filename stands on
its own.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-09 13:40:32 -08:00
eb3e3e1ddf merge-ort: collect which directories are removed in dirs_removed
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-07 15:30:03 -08:00
f5d9fbc2e9 merge-ort: initialize and free new directory rename data structures
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-07 15:30:03 -08:00
c09376d55f merge-ort: add new data structures for directory rename detection
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-07 15:30:02 -08:00
8f894b2263 Merge branch 'en/merge-ort-3' into en/ort-directory-rename
* en/merge-ort-3:
  merge-ort: add implementation of type-changed rename handling
  merge-ort: add implementation of normal rename handling
  merge-ort: add implementation of rename collisions
  merge-ort: add implementation of rename/delete conflicts
  merge-ort: add implementation of both sides renaming differently
  merge-ort: add implementation of both sides renaming identically
  merge-ort: add basic outline for process_renames()
  merge-ort: implement compare_pairs() and collect_renames()
  merge-ort: implement detect_regular_renames()
  merge-ort: add initial outline for basic rename detection
  merge-ort: add basic data structures for handling renames
2021-01-07 15:29:49 -08:00
72c4083ddf The first batch in 2.31 cycle
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-06 23:33:44 -08:00
d3aff11c3e Merge branch 'es/perf-export-fix'
Tweak unneeded recursion from a test framework helper function.

* es/perf-export-fix:
  t/perf: avoid unnecessary test_export() recursion
2021-01-06 23:33:44 -08:00
cf4b0714f7 Merge branch 'fc/t6030-bisect-reset-removes-auxiliary-files'
A 3-year old test that was not testing anything useful has been
corrected.

* fc/t6030-bisect-reset-removes-auxiliary-files:
  test: bisect-porcelain: fix location of files
2021-01-06 23:33:44 -08:00
8664fcb83b Merge branch 'es/worktree-repair-both-moved'
"git worktree repair" learned to deal with the case where both the
repository and the worktree moved.

* es/worktree-repair-both-moved:
  worktree: teach `repair` to fix multi-directional breakage
2021-01-06 23:33:44 -08:00
45a177069f Merge branch 'en/merge-ort-recursive'
The ORT merge strategy learned to synthesize virtual ancestor tree
by recursively merging multiple merge bases together, just like the
recursive backend has done for years.

* en/merge-ort-recursive:
  merge-ort: implement merge_incore_recursive()
  merge-ort: make clear_internal_opts() aware of partial clearing
  merge-ort: copy a few small helper functions from merge-recursive.c
  commit: move reverse_commit_list() from merge-recursive
2021-01-06 23:33:44 -08:00
d3fa84d528 Merge branch 'fc/pull-merge-rebase'
When a user does not tell "git pull" to use rebase or merge, the
command gives a loud message telling a user to choose between
rebase or merge but creates a merge anyway, forcing users who would
want to rebase to redo the operation.  Fix an early part of this
problem by tightening the condition to give the message---there is
no reason to stop or force the user to choose between rebase or
merge if the history fast-forwards.

* fc/pull-merge-rebase:
  pull: display default warning only when non-ff
  pull: correct condition to trigger non-ff advice
  pull: get rid of unnecessary global variable
  pull: give the advice for choosing rebase/merge much later
  pull: refactor fast-forward check
2021-01-06 23:33:44 -08:00
85cf82ff01 Merge branch 'en/merge-ort-2'
More "ORT" merge strategy.

* en/merge-ort-2:
  merge-ort: add modify/delete handling and delayed output processing
  merge-ort: add die-not-implemented stub handle_content_merge() function
  merge-ort: add function grouping comments
  merge-ort: add a paths_to_free field to merge_options_internal
  merge-ort: add a path_conflict field to merge_options_internal
  merge-ort: add a clear_internal_opts helper
  merge-ort: add a few includes
2021-01-06 23:33:44 -08:00
f9d29daba6 Merge branch 'en/merge-ort-impl'
The merge backend "done right" starts to emerge.

* en/merge-ort-impl:
  merge-ort: free data structures in merge_finalize()
  merge-ort: add implementation of record_conflicted_index_entries()
  tree: enable cmp_cache_name_compare() to be used elsewhere
  merge-ort: add implementation of checkout()
  merge-ort: basic outline for merge_switch_to_result()
  merge-ort: step 3 of tree writing -- handling subdirectories as we go
  merge-ort: step 2 of tree writing -- function to create tree object
  merge-ort: step 1 of tree writing -- record basenames, modes, and oids
  merge-ort: have process_entries operate in a defined order
  merge-ort: add a preliminary simple process_entries() implementation
  merge-ort: avoid recursing into identical trees
  merge-ort: record stage and auxiliary info for every path
  merge-ort: compute a few more useful fields for collect_merge_info
  merge-ort: avoid repeating fill_tree_descriptor() on the same tree
  merge-ort: implement a very basic collect_merge_info()
  merge-ort: add an err() function similar to one from merge-recursive
  merge-ort: use histogram diff
  merge-ort: port merge_start() from merge-recursive
  merge-ort: add some high-level algorithm structure
  merge-ort: setup basic internal data structures
2021-01-06 23:33:43 -08:00
c256631065 Merge branch 'tb/pack-bitmap'
Various improvements to the codepath that writes out pack bitmaps.

* tb/pack-bitmap: (24 commits)
  pack-bitmap-write: better reuse bitmaps
  pack-bitmap-write: relax unique revwalk condition
  pack-bitmap-write: use existing bitmaps
  pack-bitmap: factor out 'add_commit_to_bitmap()'
  pack-bitmap: factor out 'bitmap_for_commit()'
  pack-bitmap-write: ignore BITMAP_FLAG_REUSE
  pack-bitmap-write: build fewer intermediate bitmaps
  pack-bitmap.c: check reads more aggressively when loading
  pack-bitmap-write: rename children to reverse_edges
  t5310: add branch-based checks
  commit: implement commit_list_contains()
  bitmap: implement bitmap_is_subset()
  pack-bitmap-write: fill bitmap with commit history
  pack-bitmap-write: pass ownership of intermediate bitmaps
  pack-bitmap-write: reimplement bitmap writing
  ewah: add bitmap_dup() function
  ewah: implement bitmap_or()
  ewah: make bitmap growth less aggressive
  ewah: factor out bitmap growth
  rev-list: die when --test-bitmap detects a mismatch
  ...
2021-01-06 23:33:43 -08:00
b62bbd3580 Merge branch 'ab/trailers-extra-format'
The "--format=%(trailers)" mechanism gets enhanced to make it
easier to design output for machine consumption.

* ab/trailers-extra-format:
  pretty format %(trailers): add a "key_value_separator"
  pretty format %(trailers): add a "keyonly"
  pretty-format %(trailers): fix broken standalone "valueonly"
  pretty format %(trailers) doc: avoid repetition
  pretty format %(trailers) test: split a long line
2021-01-06 23:33:43 -08:00
c977ff4407 Merge branch 'pk/subsub-fetch-fix-take-2'
"git fetch --recurse-submodules" fix (second attempt).

* pk/subsub-fetch-fix-take-2:
  submodules: fix of regression on fetching of non-init subsub-repo
2021-01-06 23:33:43 -08:00
b0812b6ac0 git: add --super-prefix to usage string
When the `--super-prefix` option was implmented in 74866d7579 (git: make
super-prefix option, 2016-10-07), its existence was only documented in
the manpage but not in the command's own usage string. Given that the
commit message didn't mention that this was done intentionally and given
that it's documented in the manpage, this seems like an oversight.

Add it to the usage string to fix the inconsistency.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-06 22:55:06 -08:00
06ce79152b mktag: add a --[no-]strict option
Now that mktag has been migrated to use the fsck machinery to check
its input, it makes sense to teach it to run in the equivalent of "git
fsck"'s default mode.

For cases where mktag is used to (re)create a tag object using data
from an existing and malformed tag object, the validation may
optionally have to be loosened. Teach the command to take the
"--[no-]strict" option to do so.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-06 14:22:24 -08:00
2aa9425fbe mktag: mark strings for translation
Mark the errors mktag might emit for translation. This is a plumbing
command, but the errors it emits are intended to be human-readable.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:29 -08:00
3f390a366c mktag: convert to parse-options
Convert the "mktag" command to use parse-options.h instead of its own
ad-hoc argc handling. This doesn't matter much in practice since it
doesn't support any options, but removes another special-case in our
codebase, and makes it easier to add options to it in the future.

It does marginally improve the situation for programs that want to
execute git commands in a consistent manner and e.g. always use
--end-of-options. E.g. "gitaly" does that, and has a blacklist of
built-ins that don't support --end-of-options. This is one less
special case for it and other similar programs to support.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:29 -08:00
9a1a3a4d4c mktag: allow omitting the header/body \n separator
Change mktag's acceptance rules to accept an empty body without an
empty line after the header again. This fixes an ancient unintended
dregression in "mktag".

When "mktag" was introduced in ec4465adb3 (Add "tag" objects that can
be used to sign other objects., 2005-04-25) the input checks were much
looser. When it was documented it 6cfec03680 (mktag: minimally update
the description., 2007-06-10) it was clearly intended for this \n to
be optional:

    The message, when [it] exists, is separated by a blank line from
    the header.

But then in e0aaf781f6 (mktag.c: improve verification of tagger field
and tests, 2008-03-27) this was made an error, seemingly by
accident. It was just a result of the general header checks, and all
the tests after that patch have a trailing empty line (but did not
before).

Let's allow this again, and tweak the test semantics changed in
e0aaf781f6 to remove the redundant empty line. New tests added in
previous commits of mine already added an explicit test for allowing
the empty line between header and body.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:29 -08:00
acfc01332b mktag: allow turning off fsck.extraHeaderEntry
In earlier commits mktag learned to use the fsck machinery, at which
point we needed to add fsck.extraHeaderEntry so it could be as strict
about extra headers as it's been ever since it was implemented.

But it's not nice to need to switch away from "mktag" to "hash-object"
+ manual "fsck" just because you'd like to have an extra header. So
let's support turning it off by getting "fsck.*" variables from the
config.

Pedantically speaking it's still not possible to make "mktag" behave
just like "hash-object -t tag" does, since we're unconditionally going
to check the referenced object in verify_object_in_tag(), which is our
own check, and not one that exists in fsck.c.

But the spirit of "this works like fsck" is preserved, in that if you
created such a tag with "hash-object" and did a full "fsck" on the
repository it would also error out about that invalid object, it just
wouldn't emit the same message as fsck does.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:29 -08:00
1f3299fda9 fsck: make fsck_config() re-usable
Move the fsck_config() function from builtin/fsck.c to fsck.[ch]. This
allows for re-using it in other tools that expose fsck logic and want
to support its configuration variables.

A logical continuation of this change would be to use a common
function for all of {fetch,receive}.fsck.* and fsck.*. See
5d477a334a (fsck (receive-pack): allow demoting errors to warnings,
2015-06-22) and my own 1362df0d41 (fetch: implement fetch.fsck.*,
2018-07-27) for the relevant code.

However, those routines want to not parse the fsck.skipList into OIDs,
but rather pass them along with the --strict option to another
process. It would be possible to refactor that whole thing so we
support e.g. a "fetch." prefix, then just keep track of the skiplist
as a filename instead of parsing it, and learn to spew that all out
from our internal structures into something we can append to the
--strict option.

But instead I'm planning to re-use this in "mktag", which'll just
re-use these "fsck.*" variables as-is.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:29 -08:00
acf9de4c94 mktag: use fsck instead of custom verify_tag()
Change the validation logic in "mktag" to use fsck's fsck_tag()
instead of its own custom parser. Curiously the logic for both dates
back to the same commit[1]. Let's unify them so we're not maintaining
two sets functions to verify that a tag is OK.

The behavior of fsck_tag() and the old "mktag" code being removed here
is different in few aspects.

I think it makes sense to remove some of those checks, namely:

 A. fsck only cares that the timezone matches [-+][0-9]{4}. The mktag
    code disallowed values larger than 1400.

    Yes there's currently no timezone with a greater offset[2], but
    since we allow any number of non-offical timezones (e.g. +1234)
    passing this through seems fine. Git also won't break in the
    future if e.g. French Polynesia decides it needs to outdo the Line
    Islands when it comes to timezone extravagance.

 B. fsck allows missing author names such as "tagger <email>", mktag
    wouldn't, but would allow e.g. "tagger [2 spaces] <email>" (but
    not "tagger [1 space] <email>"). Now we allow all of these.

 C. Like B, but "mktag" disallowed spaces in the <email> part, fsck
    allows it.

In some ways fsck_tag() is stricter than "mktag" was, namely:

 D. fsck disallows zero-padded dates, but mktag didn't care. So
    e.g. the timestamp "0000000000 +0000" produces an error now. A
    test in "t1006-cat-file.sh" relied on this, it's been changed to
    use "hash-object" (without fsck) instead.

There was one check I deemed worth keeping by porting it over to
fsck_tag():

 E. "mktag" did not allow any custom headers, and by extension (as an
    empty commit is allowed) also forbade an extra stray trailing
    newline after the headers it knew about.

    Add a new check in the "ignore" category to fsck and use it. This
    somewhat abuses the facility added in efaba7cc77 (fsck:
    optionally ignore specific fsck issues completely, 2015-06-22).

    This is somewhat of hack, but probably the least invasive change
    we can make here. The fsck command will shuffle these categories
    around, e.g. under --strict the "info" becomes a "warn" and "warn"
    becomes "error". Existing users of fsck's (and others,
    e.g. index-pack) --strict option rely on this.

    So we need to put something into a category that'll be ignored by
    all existing users of the API. Pretending that
    fsck.extraHeaderEntry=error ("ignore" by default) was set serves
    to do this for us.

1. ec4465adb3 (Add "tag" objects that can be used to sign other
   objects., 2005-04-25)

2. https://en.wikipedia.org/wiki/List_of_UTC_time_offsets

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:29 -08:00
40ef015a27 mktag: use puts(str) instead of printf("%s\n", str)
This introduces no functional change, but refactors the print-out of
the hash at the end to do the same thing with less code.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:29 -08:00
dfe3948728 mktag: remove redundant braces in one-line body "if"
This minor stylistic churn is usually something we'd avoid, but if we
don't do this then the file after changes in subsequent commits will
only have this minor style inconsistency, so let's change this while
we're at it.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:29 -08:00
0c439117bb mktag: use default strbuf_read() hint
Change the hardcoded hint of 2^12 to 0. The default strbuf hint is
perfectly fine here, and the only reason we were hardcoding it is
because it survived migration from a pre-strbuf fixed-sized buffer.

See fd17f5b5f7 (Replace all read_fd use with strbuf_read, and get rid
of it., 2007-09-10) for that migration.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:29 -08:00
692654dca0 mktag tests: test verify_object() with replaced objects
Add tests to demonstrate what "mktag" does in the face of replaced
objects.

There was an existing test for replaced objects fed to "mktag" added
in cc400f5011 (mktag: call "check_sha1_signature" with the
replacement sha1, 2009-01-23), but that one only tests a
commit->commit mapping. Not a mapping to a different type as like
we're also testing for here. We could remove the "mktag" test in
t6050-replace.sh now if the created tag wasn't being used by a
subsequent "fsck" test.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:29 -08:00
30f882c16d mktag tests: improve verify_object() test coverage
The verify_object() function in "mktag.c" is tasked with ensuring that
our tag refers to a valid object.

The existing test for this might fail because it was also testing that
"type taggg" didn't refer to a valid object type (it should be "type
tag"), or because we referred to a valid object but got the type
wrong.

Let's split these tests up, so we're testing all combinations of a
non-existing object and in invalid/wrong "type" lines.

We need to provide GIT_TEST_GETTEXT_POISON=false here because the
"invalid object type" error is emitted by
parse_loose_header_extended(), which has that message already marked
for translation. Another option would be to use test_i18ngrep, but I
prefer always running the test, not skipping it under gettext poison
testing.

I'm not testing this in combination with "git replace". That'll be
done in a subsequent commit.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:29 -08:00
ca9a1ed969 mktag tests: test "hash-object" compatibility
Change all the successful "mktag" tests to test that "hash-object"
produces the same hash for the input, and that fsck passes for
both.

This tests e.g. that "mktag" doesn't trim its input or otherwise munge
it in a way that "hash-object" doesn't.

Since we're doing an "fsck --strict" here at the end let's incorporate
the creation of the "mytag" name into this test, removing the
special-case at the end of the file.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:29 -08:00
47c95e77d1 mktag tests: stress test whitespace handling
Add tests for a couple of whitespace edge cases around the header/body
boundary.

I consider the requirement for a blank line before the empty body a
bug, it's a long-standing regression which goes against the command's
documented behavior. This bug will be addressed in a follow-up change.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:29 -08:00
3b9e4dd3a3 mktag tests: run "fsck" after creating "mytag"
Change the last test in the file to run an "fsck --strict" after
creating the tag at the end.

We're just doing this for good measure to check that fsck behaves as
expected now that there's finally a reference for our valid tag. Other
tests going to be checking this elsewhere, but it's nice to cover all
the edge cases in this test to make it as self-contained as possible.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:29 -08:00
5c2303e0c7 mktag tests: don't create "mytag" twice
Change a test added in e0aaf781f6 (mktag.c: improve verification of
tagger field and tests, 2008-03-27) to not create "mytag", which
should only be created and verified at the end in an earlier test
added in 446c6faec6 (New tests and en-passant modifications to mktag.,
2006-07-29).

While we're at it let's prevent a similar logic error from creeping
into the test by asserting that "mytag" doesn't exist before we create
it. Let's do this by moving the test to use "update-ref", instead of
our own homebrew ad-hoc refstore update.

We're not really testing for anything yet by creating the tag at the
end here. A subsequent commit will change that.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:29 -08:00
317c176279 mktag tests: don't redirect stderr to a file needlessly
Remove the redirection of stderr to "message" in the valid tag
test. This pattern seems to have been copy/pasted from the failure
case in 446c6faec6 (New tests and en-passant modifications to mktag.,
2006-07-29).

While I'm at it do the same for the "replace" tests. The tag creation
I'm changing here seems to have been copy/pasted from the "mktag"
tests to those tests in cc400f5011 (mktag: call
"check_sha1_signature" with the replacement sha1, 2009-01-23).

Nobody examines the contents of the resulting "message" file, so the
net result is that error messages cannot be seen in "sh t3800-mktag.sh
-v" output.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:28 -08:00
0d35ccb5e0 mktag tests: remove needless SHA-1 hardcoding
Change the tests amended in acb49d1cc8 (t3800: make hash-size
independent, 2019-08-18) even more to make them independent of either
SHA-1 or SHA-256.

Some of these tests were failing for the wrong reasons. The first one
being modified here would fail because the line starts with "xxxxxx"
instead of "object", the rest of the line doesn't matter.

Let's just put a valid hash on the rest of the line anyway to narrow
the test down for just the s/object/xxxxxx/ case.

The second one being modified here would fail under
GIT_TEST_DEFAULT_HASH=sha256 because <some sha-1 length garbage> is an
invalid SHA-256, but we should really be testing <some sha-256 length
garbage> when under SHA-256.

This doesn't really matter since we should be able to trust other
parts of the code to validate things in the 0-9a-f range, but let's
keep it for good measure.

There's a later test which tests an invalid SHA which looks like a
valid one, to stress the "We refuse to tag something we can't
verify[...]" logic in mktag.c.

But here we're testing for a SHA-length string which contains
characters outside of the /[0-9a-f]/i set.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:28 -08:00
b5ca549c93 mktag tests: use "test_commit" helper
Replace ad-hoc setup of a single commit in the "mktag" tests with our
standard helper pattern. The old setup dated back to 446c6faec6 (New
tests and en-passant modifications to mktag., 2006-07-29) before the
helper existed.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:28 -08:00
aba5377f69 mktag tests: don't needlessly use a subshell
The use of a subshell dates back to e9b20943b7 (t/t3800: do not use a
temporary file to hold expected result., 2008-01-04). It's not needed
anymore, if it ever was.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:28 -08:00
18430ed363 mktag doc: update to explain why to use this
Change the mktag documentation to compare itself to the similar
"hash-object -t tag" command. Before this someone reading the
documentation wouldn't have much of an idea what the difference
was.

Let's allude to our own validation logic, and cross-link the "mktag"
and "hash-object" documentation to aid discover-ability. A follow-up
change to migrate "mktag" to use "fsck" validation will make the part
about validation logic clearer.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:28 -08:00
3797a0a7b7 maintenance: use Windows scheduled tasks
Git's background maintenance uses cron by default, but this is not
available on Windows. Instead, integrate with Task Scheduler.

Tasks can be scheduled using the 'schtasks' command. There are several
command-line options that can allow for some advanced scheduling, but
unfortunately these seem to all require authenticating using a password.

Instead, use the "/xml" option to pass an XML file that contains the
configuration for the necessary schedule. These XML files are based on
some that I exported after constructing a schedule in the Task Scheduler
GUI. These options only run background maintenance when the user is
logged in, and more fields are populated with the current username and
SID at run-time by 'schtasks'.

Since the GIT_TEST_MAINT_SCHEDULER environment variable allows us to
specify 'schtasks' as the scheduler, we can test the Windows-specific
logic on other platforms. Thus, add a check that the XML file written
by Git is valid when xmllint exists on the system.

Since we use a temporary file for the XML files sent to 'schtasks', we
prefix the random characters with the frequency so it is easier to
examine the proper file during tests. Instead of an exact match on the
'args' file, we 'grep' for the arguments other than the filename.

There is a deficiency in the current design. Windows has two kinds of
applications: GUI applications that start by "winmain()" and console
applications that start by "main()". Console applications are attached
to a new Console window if they are not already associated with a GUI
application. This means that every hour the scheudled task launches a
command window for the scheduled tasks. Not only is this visually
obtrusive, but it also takes focus from whatever else the user is
doing!

A simple fix would be to insert a GUI application that acts as a shim
between the scheduled task and Git. This is currently possible in Git
for Windows by setting the <Command> tag equal to

  C:\Program Files\Git\git-bash.exe

with options "--hide --no-needs-console --command=cmd\git.exe"
followed by the arguments currently used. Since git-bash.exe is not
included in Windows builds of core Git, I chose to leave out this
feature. My plan is to submit a small patch to Git for Windows that
converts the use of git.exe with this use of git-bash.exe in the
short term. In the long term, we can consider creating this GUI
shim application within core Git, perhaps in contrib/.

Co-authored-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:38:02 -08:00
2afe7e3567 maintenance: use launchctl on macOS
The existing mechanism for scheduling background maintenance is done
through cron. The 'crontab -e' command allows updating the schedule
while cron itself runs those commands. While this is technically
supported by macOS, it has some significant deficiencies:

1. Every run of 'crontab -e' must request elevated privileges through
   the user interface. When running 'git maintenance start' from the
   Terminal app, it presents a dialog box saying "Terminal.app would
   like to administer your computer. Administration can include
   modifying passwords, networking, and system settings." This is more
   alarming than what we are hoping to achieve. If this alert had some
   information about how "git" is trying to run "crontab" then we would
   have some reason to believe that this dialog might be fine. However,
   it also doesn't help that some scenarios just leave Git waiting for
   a response without presenting anything to the user. I experienced
   this when executing the command from a Bash terminal view inside
   Visual Studio Code.

2. While cron initializes a user environment enough for "git config
   --global --show-origin" to show the correct config file information,
   it does not set up the environment enough for Git Credential Manager
   Core to load credentials during a 'prefetch' task. My prefetches
   against private repositories required re-authenticating through UI
   pop-ups in a way that should not be required.

The solution is to switch from cron to the Apple-recommended [1]
'launchd' tool.

[1] https://developer.apple.com/library/archive/documentation/MacOSX/Conceptual/BPSystemStartup/Chapters/ScheduledJobs.html

The basics of this tool is that we need to create XML-formatted
"plist" files inside "~/Library/LaunchAgents/" and then use the
'launchctl' tool to make launchd aware of them. The plist files
include all of the scheduling information, along with the command-line
arguments split across an array of <string> tags.

For example, here is my plist file for the weekly scheduled tasks:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0"><dict>
<key>Label</key><string>org.git-scm.git.weekly</string>
<key>ProgramArguments</key>
<array>
<string>/usr/local/libexec/git-core/git</string>
<string>--exec-path=/usr/local/libexec/git-core</string>
<string>for-each-repo</string>
<string>--config=maintenance.repo</string>
<string>maintenance</string>
<string>run</string>
<string>--schedule=weekly</string>
</array>
<key>StartCalendarInterval</key>
<array>
<dict>
<key>Day</key><integer>0</integer>
<key>Hour</key><integer>0</integer>
<key>Minute</key><integer>0</integer>
</dict>
</array>
</dict>
</plist>

The schedules for the daily and hourly tasks are more complicated
since we need to use an array for the StartCalendarInterval with
an entry for each of the six days other than the 0th day (to avoid
colliding with the weekly task), and each of the 23 hours other
than the 0th hour (to avoid colliding with the daily task).

The "Label" value is currently filled with "org.git-scm.git.X"
where X is the frequency. We need a different plist file for each
frequency.

The launchctl command needs to be aligned with a user id in order
to initialize the command environment. This must be done using
the 'launchctl bootstrap' subcommand. This subcommand is new as
of macOS 10.11, which was released in September 2015. Before that
release the 'launchctl load' subcommand was recommended. The best
source of information on this transition I have seen is available
at [2]. The current design does not preclude a future version that
detects the available fatures of 'launchctl' to use the older
commands. However, it is best to rely on the newest version since
Apple might completely remove the deprecated version on short
notice.

[2] https://babodee.wordpress.com/2016/04/09/launchctl-2-0-syntax/

To remove a schedule, we must run 'launchctl bootout' with a valid
plist file. We also need to 'bootout' a task before the 'bootstrap'
subcommand will succeed, if such a task already exists.

The need for a user id requires us to run 'id -u' which works on
POSIX systems but not Windows. Further, the need for fully-qualitifed
path names including $HOME behaves differently in the Git internals and
the external test suite. The $HOME variable starts with "C:\..." instead
of the "/c/..." that is provided by Git in these subcommands. The test
therefore has a prerequisite that we are not on Windows. The cross-
platform logic still allows us to test the macOS logic on a Linux
machine.

We can verify the commands that were run by 'git maintenance start'
and 'git maintenance stop' by injecting a script that writes the
command-line arguments into GIT_TEST_MAINT_SCHEDULER.

An earlier version of this patch accidentally had an opening
"<dict>" tag when it should have had a closing "</dict>" tag. This
was caught during manual testing with actual 'launchctl' commands,
but we do not want to update developers' tasks when running tests.
It appears that macOS includes the "xmllint" tool which can verify
the XML format. This is useful for any system that might contain
the tool, so use it whenever it is available.

We strive to make these tests work on all platforms, but Windows caused
some headaches. In particular, the value of getuid() called by the C
code is not guaranteed to be the same as `$(id -u)` invoked by a test.
This is because `git.exe` is a native Windows program, whereas the
utility programs run by the test script mostly utilize the MSYS2 runtime,
which emulates a POSIX-like environment. Since the purpose of the test
is to check that the input to the hook is well-formed, the actual user
ID is immaterial, thus we can work around the problem by making the the
test UID-agnostic. Another subtle issue is the $HOME environment
variable being a Windows-style path instead of a Unix-style path. We can
be more flexible here instead of expecting exact path matches.

Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Co-authored-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:38:02 -08:00
5a067ba9d0 completion: add proper public __git_complete
When __git_complete was introduced, it was meant to be temporarily, while
a proper guideline for public shell functions was established
(tentatively _GIT_complete), but since that never happened, people
in the wild started to use __git_complete, even though it was marked as
not public.

Eight years is more than enough wait, let's mark this function as
public, and make it a bit more user-friendly.

So that instead of doing:

  __git_complete gk __gitk_main

The user can do:

  __git_complete gk gitk

And instead of:

  __git_complete gf _git_fetch

Do:

  __git_complete gf git_fetch

Backwards compatibility is maintained.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-04 15:25:56 -08:00
0e02bdc17a test: completion: add tests for __git_complete
Even though the function was marked as not public, it's already used in
the wild.

We should at least test basic functionality.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-04 15:25:56 -08:00
810df0ea8e completion: bash: improve function detection
1. We should quote the argument
 2. We don't need two redirections
 3. A safeguard for arguments (-a) would be good

Suggested-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-04 15:25:56 -08:00
7f94b78dda completion: bash: add __git_have_func helper
This makes the code more readable, and also will help when new code
wants to do similar checks.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-04 15:25:56 -08:00
fa7ca5d4fe cache-tree: use trace2 in cache_tree_update()
This matches a trace_performance_enter()/trace_performance_leave() pair
added by 0d1ed59 (unpack-trees: add performance tracing, 2018-08-18).

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-04 15:23:08 -08:00
c338898a47 unpack-trees: add trace2 regions
The unpack_trees() method is quite complicated and its performance can
change dramatically depending on how it is used. We already have some
performance tracing regions, but they have not been updated to the
trace2 API. Do so now.

We already have trace2 regions in unpack_trees.c:clear_ce_flags(), which
uses a linear scan through the index without recursing into trees.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-04 15:23:08 -08:00
da8be8ced6 tree-walk: report recursion counts
The traverse_trees() method recursively walks through trees, but also
prunes the tree-walk based on a callback. Some callers, such as
unpack_trees(), are quite complicated and can have wildly different
performance between two different commands.

Create constants that count these values and then report the results at
the end of a process. These counts are cumulative across multiple "root"
instances of traverse_trees(), but they provide reproducible values for
demonstrating improvements to the pruning algorithm when possible.

This change is modeled after a similar statistics reporting in 42e50e78
(revision.c: add trace2 stats around Bloom filter usage, 2020-04-06).

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-04 15:23:08 -08:00
90b666da60 revision: trace topo-walk statistics
We trace statistics about the effectiveness of changed-path Bloom
filters since 42e50e78 (revision.c: add trace2 stats around Bloom
filter usage, 2020-04-06). Add similar tracing for the topo-walk
algorithm that uses generation numbers to limit the walk size.

This information can help investigate and describe benefits to
heuristics and other changes.

The information that is printed is in JSON format and can be formatted
nicely to present as follows:

    {
	"count_explort_walked":2603,
	"count_indegree_walked":2603,
	"count_topo_walked":473
    }

Each of these values count the number of commits are visited by each of
the three "stages" of the topo-walk as detailed in b4542418 (revision.c:
generation-based topo-order algorithm, 2018-11-01).

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-04 15:18:22 -08:00
bc62692757 hash-lookup: rename from sha1-lookup
Change all remnants of "sha1" in hash-lookup.c and .h and rename them to
reflect that we're not just able to handle SHA-1 these days.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-04 13:01:55 -08:00
7a7d992d0d sha1-lookup: rename sha1_pos() as hash_pos()
Rename this function to reflect that we're not just able to handle SHA-1
these days. There are a few instances of "sha1" left in sha1-lookup.[ch]
after this, but those will be addressed in the next commit.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-04 13:01:55 -08:00
e5afd4449d object-file.c: rename from sha1-file.c
Drop the last remnant of "sha1" in this file and rename it to reflect
that we're not just able to handle SHA-1 these days.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-04 13:01:55 -08:00
1e6771e504 object-name.c: rename from sha1-name.c
Generalize the last remnants of "sha" and "sha1" in this file and rename
it to reflect that we're not just able to handle SHA-1 these days.

We need to update one test to check for an updated error string.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-04 13:01:55 -08:00
350410f6b1 diffcore-rename: remove unnecessary duplicate entry checks
Commit 25d5ea410f ("[PATCH] Redo rename/copy detection logic.",
2005-05-24) added a duplicate entry check on rename_src in order to
avoid segfaults; the code at the time was prone to double free()s and an
easy way to avoid it was just to turn off rename detection for any
duplicate entries.  Note that the form of the check was modified two
commits ago in this series.

Similarly, commit 4d6be03b95 ("diffcore-rename: avoid processing
duplicate destinations", 2015-02-26) added a duplicate entry check
on rename_dst for the exact same reason -- the code was prone to double
free()s, and an easy way to avoid it was just to turn off rename
detection entirely.  Note that the form of the check was modified in the
commit just before this one.

In the original code in both places, the code was dealing with
individual diff_filespecs and trying to match things up, instead of just
keeping the original diff_filepairs around as we do now.  The
intervening change in structure has fixed the accounting problems and
the associated double free()s that used to occur, and thus we already
have a better fix.  As such, we can remove the band-aid checks for
duplicate entries.

Due to the last two patches, the diffcore_rename() setup is no longer a
sizeable chunk of overall runtime.  Thus, in a large rebase of many
commits with lots of renames and several optimizations to inexact rename
detection, this patch only speeds up the overall code by about half a
percent or so and is pretty close to the run-to-run variability making
it hard to get an exact measurement.  However, with some trace2 regions
around the setup code in diffcore_rename() so that I can focus on just
it, I measure that this patch consistently saves almost a third of the
remaining time spent in diffcore_rename() setup.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-04 12:59:34 -08:00
4ef88fc3a8 merge-ort: add handling for different types of files at same path
Add some handling that explicitly considers collisions of the following
types:
  * file/submodule
  * file/symlink
  * submodule/symlink
Leaving them as conflicts at the same path are hard for users to
resolve, so move one or both of them aside so that they each get their
own path.

Note that in the case of recursive handling (i.e. call_depth > 0), we
can just use the merge base of the two merge bases as the merge result
much like we do with modify/delete conflicts, binary files, conflicting
submodule values, and so on.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-04 10:40:45 -08:00
4204cd591b merge-ort: copy find_first_merges() implementation from merge-recursive.c
Code is identical for the function body in the two files, the call
signature is just slightly different in merge-ort than merge-recursive
as noted a couple commits ago.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-04 10:40:45 -08:00
70f19c7fce merge-ort: implement format_commit()
This implementation is based on a mixture of print_commit() and
output_commit_title() from merge-recursive.c so that it can be used to
take over both functions.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-04 10:40:45 -08:00
c73cda76b1 merge-ort: copy and adapt merge_submodule() from merge-recursive.c
Take merge_submodule() from merge-recursive.c and make slight
adjustments, predominantly around deferring output using path_msg()
instead of using merge-recursive's output() and show() functions.
There's also a fix for recursive cases (when call_depth > 0) and a
slight change to argument order for find_first_merges().

find_first_merges() and format_commit() are left unimplemented for
now, but will be added by subsequent commits.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-04 10:40:45 -08:00
f591c47246 merge-ort: copy and adapt merge_3way() from merge-recursive.c
Take merge_3way() from merge-recursive.c and make slight adjustments
based on different data structures (direct usage of object_id
rather diff_filespec, separate pathnames which based on our careful
interning of pathnames in opt->priv->paths can be compared with '!='
rather than 'strcmp').

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-04 10:40:45 -08:00
62fdec17a1 merge-ort: flesh out implementation of handle_content_merge()
This implementation is based heavily on merge_mode_and_contents() from
merge-recursive.c, though it has some fixes for recursive merges (i.e.
when call_depth > 0), and has a number of changes throughout based on
slight differences in data structures and in how the functions are
called.

It is, however, based on two new helper functions -- merge_3way() and
merge_submodule -- for which we only provide die-not-implemented stubs
at this point.  Future commits will add implementations of these
functions.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-04 10:40:45 -08:00
991bbdcab9 merge-ort: handle book-keeping around two- and three-way content merge
In addition to the content merge (which will go in a subsequent commit),
we need to worry about conflict messages, placing results in higher
order stages in case of a df_conflict, and making sure the results are
placed in ci->merged.result so that they will show up in the working
tree.  Take care of all that external book-keeping, moving the
simplistic just-take-HEAD code into the barebones handle_content_merge()
function for now.  Subsequent commits will flesh out
handle_content_merge().

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-04 10:40:45 -08:00
5a1a1e8ea9 merge-ort: implement unique_path() helper
Implement unique_path(), based on the one from merge-recursive.c.  It is
simplified, however, due to: (1) using strmaps, and (2) the fact that
merge-ort lets the checkout codepath handle possible collisions with the
working tree means that other code locations don't have to.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-04 10:40:45 -08:00
23366d2aa9 merge-ort: handle directory/file conflicts that remain
When a directory/file conflict remains, we can leave the directory where
it is, but need to move the information about the file to a different
pathname.  After moving the file to a different pathname, we allow
subsequent process_entry() logic to handle any additional details that
might be relevant.

This depends on a new helper function, unique_path(), that dies with an
unimplemented error currently but will be implemented in a subsequent
commit.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-04 10:40:45 -08:00
0ccfa4e5d8 merge-ort: handle D/F conflict where directory disappears due to merge
When one side has a directory at a given path and the other side of
history has a file at the path, but the merge resolves the directory
away (e.g. because no path within that directory was modified and the
other side deleted it, or because renaming moved all the files
elsewhere), then we don't actually have a conflict anymore.  We just
need to clear away any information related to the relevant directory,
and then the subsequent process_entry() handling can handle the given
path.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-04 10:40:45 -08:00
ffd27e6cb2 CoC: explicitly take any whitespace breakage
We'll keep this document mostly in sync with the upstream; let's
help "git am" and "git show" by telling them that they may introduce
what we may consider whitespace errors.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-04 09:44:49 -08:00
cb50786f49 CoC: Update word-wrapping to match upstream
When the CoC document was added in 5cdf2301d4 (add a Code of Conduct
document, 2019-09-24) it was added from some 1.4 version of the
document whose word wrapping doesn't match what's currently at [1],
which matches content/version/1/4/code-of-conduct.md in the CoC
repository[2].

Let's update our version to match that, to make reading subsequent
diffs easier. There are no non-whitespace changes here.

1. https://www.contributor-covenant.org/version/1/4/code-of-conduct/
2. https://github.com/ContributorCovenant/contributor_covenant

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-04 09:14:38 -08:00
a9ecaa06a7 core.abbrev=no disables abbreviations
This allows users to write hash-agnostic scripts and configs by
disabling abbreviations.  Using "-c core.abbrev=40" will be
insufficient with SHA-256, and "-c core.abbrev=64" won't work with
SHA-1 repos today.

Signed-off-by: Eric Wong <e@80x24.org>
[jc: tweaked implementation, added doc and a test]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-23 13:40:09 -08:00
9ce0fc3311 mktag doc: grammar fix, when exists -> when it exists
Amend the wording of documentation added in 6cfec03680 (mktag:
minimally update the description., 2007-06-10). It makes more sense to
say "when it exists" here, as we're referring to "the message".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-22 17:49:05 -08:00
f59b61dc4d mktag doc: say <hash> not <sha1>
Change the "mktag" documentation to refer to the input hash as just
"hash", not "sha1". This command has supported SHA-256 for a while
now.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-22 17:49:05 -08:00
5bc12c11cc t/perf: avoid unnecessary test_export() recursion
test_export() has been self-recursive since its inception even though a
simple for-loop would have served just as well to append its arguments
to the `test_export_` variable separated by the pipe character "|".
Recently `test_export_` was changed instead to a space-separated list of
tokens to be exported, an operation which can be accomplished via a
single simple assignment, with no need for looping or recursion.
Therefore, simplify the implementation.

While at it, take advantage of the fact that variable names to be
exported are shell identifiers, thus won't be composed of special
characters or whitespace, thus simple a `$*` can be used rather than
magical `"$@"`.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-22 13:45:36 -08:00
af04d8f1a5 t4013: add tests for --diff-merges=first-parent
This new option provides essential new functionality, changing diff
output to first parent only, without changing history traversal mode,
so it deserves its own test.

As we do it, add additional test that --diff-merges=first-parent by
itself doesn't imply -p and only outputs diffs for merge commits.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-21 13:47:32 -08:00
1d24509b7b doc/git-show: include --diff-merges description
Move description of --diff-merges option from git-log.txt to
diff-options.txt so that it is included in the git-show help.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-21 13:47:32 -08:00
e58142add4 doc/rev-list-options: document --first-parent changes merges format
After introduction of the --diff-merges=first-parent, the
--first-parent sets the default format for merges to the same value as
this new option. Document this behavior and add corresponding
reference to --diff-merges.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-21 13:47:32 -08:00
8efd2efc32 doc/diff-generate-patch: mention new --diff-merges option
Mention --diff-merges instead of -m in a note to merge formats to aid
discoverability, as -m is now described among --diff-merges options
anyway.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-21 13:47:32 -08:00
b5ffa9ec10 doc/git-log: describe new --diff-merges options
Describe all the new --diff-merges options in the git-log.txt and
adopt description of originals accordingly.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-21 13:47:32 -08:00
388091fe4d diff-merges: add '--diff-merges=1' as synonym for 'first-parent'
As we now have --diff-merges={m|c|cc}, add --diff-merges=1 as synonym
for --diff-merges=first-parent, to have shorter mnemonics for it as
well.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-21 13:47:32 -08:00
5071c75316 diff-merges: add old mnemonic counterparts to --diff-merges
This adds --diff-merges={m|c|cc} values that match mnemonics of old
options, for those who are used to them.

Note that, say, --diff-meres=cc behaves differently than --cc, as the
latter implies -p and therefore enables diffs for all the commits,
while the former enables output of diffs for merge commits only.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-21 13:47:32 -08:00
a6d19ecc6b diff-merges: let new options enable diff without -p
New options don't have any visible effect unless -p is either given or
implied, as unlike -c/-cc we don't imply -p with --diff-merges. To fix
this, this patch adds new functionality by letting new options enable
output of diffs for merge commits only.

Add 'merges_need_diff' field and set it whenever diff output for merges is
enabled by any of the new options.

Extend diff output logic accordingly, to output diffs for merges when
'merges_need_diff' is set even when no -p has been provided.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-21 13:47:32 -08:00
5733b20f41 diff-merges: do not imply -p for new options
Add 'combined_imply_patch' field and set it only for old --cc/-c
options, then imply -p if this flag is set instead of implying -p
whenever 'combined_merge' flag is set.

We don't want new --diff-merge options to imply -p, to make it
possible to enable output of diffs for merges independently from
non-merge commits. At the same time we want to preserve behavior of
old --c/-c/-m options and their interactions with --first-parent, to
stay backward-compatible.

This patch is first step in this direction: it separates old "--cc/-c
imply -p" logic from the rest of the options.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-21 13:47:32 -08:00
8c0ba528bc diff-merges: implement new values for --diff-merges
We first implement new options as exact synonyms for their original
counterparts, to get all the infrastructure right, and keep functional
improvements for later commits.

The following values are implemented:

--diff-merges=	        old equivalent
first|first-parent    = --first-parent (only format implications)
sep|separate          = -m
comb|combined         = -c
dense| dense-combined = --cc

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-21 13:47:32 -08:00
255a4dacc5 diff-merges: make -m/-c/--cc explicitly mutually exclusive
-c/--cc got precedence over -m only because of external logic where
corresponding flags are checked before that for -m. This is too
error-prone, so add code that explicitly makes these 3 options
mutually exclusive, so that the last option specified on the
command-line gets precedence.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-21 13:47:31 -08:00
3d2b5f2f49 diff-merges: refactor opt settings into separate functions
To prepare introduction of new options some of which will be synonyms
to existing options, let every option handling code just call
corresponding function.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-21 13:47:31 -08:00
a6e66af923 diff-merges: get rid of now empty diff_merges_init_revs()
After getting rid of 'ignore_merges' field, the diff_merges_init_revs()
function became empty. Get rid of it.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-21 13:47:31 -08:00
d9b1bc6d13 diff-merges: group diff-merge flags next to each other inside 'rev_info'
The relevant flags were somewhat scattered over definition of 'struct
rev_info'. Rearrange them to group them together.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-21 13:47:31 -08:00
1a2c4d8050 diff-merges: split 'ignore_merges' field
'ignore_merges' was 3-way field that served two distinct purposes that
we now assign to 2 new independent flags: 'separate_merges', and
'explicit_diff_merges'.

'separate_merges' tells that we need to output diff format containing
separate diff for every parent (as opposed to 'combine_merges').

'explicit_diff_merges' tells that at least one of diff-merges options
has been explicitly specified on the command line, so no defaults
should apply.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-21 13:47:31 -08:00
6fc944d895 diff-merges: fix -m to properly override -c/--cc
Logically, -m, -c, --cc specify 3 different formats for representing
merge commits, yet -m doesn't in fact override -c or --cc, that makes
no sense.

Fix -m to properly override -c/--cc, and change the tests accordingly.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-21 13:47:31 -08:00
ec315c66bb t4013: add tests for -m failing to override -c/--cc
Logically, -m, -c, --cc specify 3 different formats for representing
merge commits, yet -m doesn't in fact override -c or --cc, that makes
no sense.

Add 2 expected to fail tests that demonstrate the problem.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-21 13:47:31 -08:00
14c14b44e4 t4013: support test_expect_failure through ':failure' magic
Add support to be able to specify expected failure, through :failure
magic, like this:

:failure cmd args

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-21 13:47:31 -08:00
e121b4b822 diff-merges: revise revs->diff flag handling
Do not set revs->diff when we encounter an option that needs it, as
it'd be impossible to undo later. Besides, some other options than
what we handle here set this flag as well, and we'd interfere with
them trying to clear this flag later.

Rather set revs->diff, if finally needed, in diff_merges_setup_revs().

As an additional bonus, this also makes our code shorter.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-21 13:47:31 -08:00
0c627f5d3c diff-merges: handle imply -p on -c/--cc logic for log.c
Move logic that handles implying -p on -c/--cc from
log_setup_revisions_tweak() to diff_merges_setup_revs(), where it
belongs.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-21 13:47:31 -08:00
3291eea310 diff-merges: introduce revs->first_parent_merges flag
This new field allows us to separate format of diff for merges from
'first_parent_only' flag which primary purpose is limiting history
traversal.

This change further localizes diff format selection logic into the
diff-merges.c file.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-21 13:47:31 -08:00
3b6c17b5c0 diff-merges: new function diff_merges_set_dense_combined_if_unset()
Call it where given functionality is needed instead of direct
checking/tweaking of diff merges related fields.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-21 13:47:31 -08:00
09322b1da9 diff-merges: new function diff_merges_suppress()
This function sets all the relevant flags to disabled state, so that
no code that checks only one of them get it wrong.

Then we call this new function everywhere where diff merges output
suppression is needed.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-21 13:47:31 -08:00
564a4fc847 diff-merges: re-arrange functions to match the order they are called in
For clarity, define public functions in the order they are called, to
make logic inter-dependencies easier to grok.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-21 13:47:31 -08:00
4f54544d73 diff-merges: rename diff_merges_default_to_enable() to match semantics
Rename diff_merges_default_to_enable() to
diff_merges_default_to_first_parent() to match its semantics.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-21 13:47:31 -08:00
7acf0d06f5 diff-merges: move checks for first_parent_only out of the module
The checks for first_parent_only don't in fact belong to this module,
as the primary purpose of this flag is history traversal limiting, so
get it out of this module and rename the

diff_merges_first_parent_defaults_to_enable()

to

diff_merges_default_to_enable()

to match new semantics.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-21 13:47:30 -08:00
18f09473bf diff-merges: rename all functions to have common prefix
Use the same "diff_merges" prefix for all the diff merges function
names.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-21 13:47:30 -08:00
a37eec6333 revision: move diff merges functions to its own diff-merges.c
Create separate diff-merges.c and diff-merges.h files, and move all
the code related to handling of diff merges there.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-21 13:47:30 -08:00
3d4fd94363 revision: provide implementation for diff merges tweaks
Use these implementations from show_setup_revisions_tweak() and
log_setup_revisions_tweak() in builtin/log.c.

This completes moving of management of diff merges parameters to a
single place, where we can finally observe them simultaneously.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-21 13:47:30 -08:00
027c4783d7 revision: factor out initialization of diff-merge related settings
Move initialization code related to diffing merges into new
init_diff_merge_revs() function.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-21 13:47:30 -08:00
299a663440 revision: factor out setup of diff-merge related settings
Move all the setting code related to diffing merges into new
setup_diff_merge_revs() function.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-21 13:47:30 -08:00
891e417cbc revision: factor out parsing of diff-merge related options
Move all the parsing code related to diffing merges into new
parse_diff_merge_opts() function.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-21 13:47:30 -08:00
cf76baea41 worktree: teach repair to fix multi-directional breakage
`git worktree repair` knows how to repair the two-way links between the
repository and a worktree as long as a link in one or the other
direction is sound. For instance, if a linked worktree is moved (without
using `git worktree move`), repair is possible because the worktree
still knows the location of the repository even though the repository no
longer knows where the worktree is. Similarly, if the repository is
moved, repair is possible since the repository still knows the locations
of the worktrees even though the worktrees no longer know where the
repository is.

However, if both the repository and the worktrees are moved, then links
are severed in both directions, and no repair is possible. This is the
case even when the new worktree locations are specified as arguments to
`git worktree repair`. The reason for this limitation is twofold. First,
when `repair` consults the worktree's gitfile (/path/to/worktree/.git)
to determine the corresponding <repo>/worktrees/<id>/gitdir file to fix,
<repo> is the old path to the repository, thus it is unable to fix the
`gitdir` file at its new location since it doesn't know where it is.
Second, when `repair` consults <repo>/worktrees/<id>/gitdir to find the
location of the worktree's gitfile (/path/to/worktree/.git), the path
recorded in `gitdir` is the old location of the worktree's gitfile, thus
it is unable to repair the gitfile since it doesn't know where it is.

Fix these shortcomings by teaching `repair` to attempt to infer the new
location of the <repo>/worktrees/<id>/gitdir file when the location
recorded in the worktree's gitfile has become stale but the file is
otherwise well-formed. The inference is intentionally simple-minded.
For each worktree path specified as an argument, `git worktree repair`
manually reads the ".git" gitfile at that location and, if it is
well-formed, extracts the <id>. It then searches for a corresponding
<id> in <repo>/worktrees/ and, if found, concludes that there is a
reasonable match and updates <repo>/worktrees/<id>/gitdir to point at
the specified worktree path. In order for <repo> to be known, `git
worktree repair` must be run in the main worktree or bare repository.

`git worktree repair` first attempts to repair each incoming
/path/to/worktree/.git gitfile to point at the repository, and then
attempts to repair outgoing <repo>/worktrees/<id>/gitdir files to point
at the worktrees. This sequence was chosen arbitrarily when originally
implemented since the order of fixes is immaterial as long as one side
of the two-way link between the repository and a worktree is sound.
However, for this new repair technique to work, the order must be
reversed. This is because the new inference mechanism, when it is
successful, allows the outgoing <repo>/worktrees/<id>/gitdir file to be
repaired, thus fixing one side of the two-way link. Once that side is
fixed, the other side can be fixed by the existing repair mechanism,
hence the order of repairs is now significant.

Two safeguards are employed to avoid hijacking a worktree from a
different repository if the user accidentally specifies a foreign
worktree as an argument. The first, as described above, is that it
requires an <id> match between the repository and the worktree. That
itself is not foolproof for preventing hijack, so the second safeguard
is that the inference will only kick in if the worktree's
/path/to/worktree/.git gitfile does not point at a repository.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-21 13:44:28 -08:00
8119214f4e merge-ort: implement merge_incore_recursive()
Implement merge_incore_recursive(), mostly through the use of a new
helper function, merge_ort_internal(), which itself is based off
merge_recursive_internal() from merge-recursive.c.

This drops the number of failures in the testsuite when run under
GIT_TEST_MERGE_ALGORITHM=ort from around 1500 to 647.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-16 21:56:39 -08:00
43e9c4eecc merge-ort: make clear_internal_opts() aware of partial clearing
In order to handle recursive merges, after merging merge-bases we need
to clear away most of the data we had built up but some of it needs to
be kept -- in particular the "output" field.  Rename the function to
reflect its future change in use.

Further, since "reinitialize" means we'll be reusing the fields
immediately, take advantage of this to only partially clear maps,
leaving the hashtable allocated and pre-sized.  (This may be slightly
out-of-order since the speedups aren't realized until there are far
more strmaps in use, but the patch submission process already went out
of order because of various questions and requests for strmap.  Anyway,
see commit 6ccdfc2a20 ("strmap: enable faster clearing and reusing of
strmaps", 2020-11-05), for performance details about the use of
strmap_partial_clear().)

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-16 21:56:39 -08:00
4296d8f17d merge-ort: copy a few small helper functions from merge-recursive.c
In a subsequent commit, we will implement the traditional recursiveness
that gave merge-recursive its name, namely merging non-unique
merge-bases to come up with a single virtual merge base.  Copy a few
helper functions from merge-recursive.c that we will use in the
implementation.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-16 21:56:39 -08:00
b0ca120554 commit: move reverse_commit_list() from merge-recursive
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-16 21:56:39 -08:00
c525de335e pull: display default warning only when non-ff
There's no need to display the annoying warning on every pull... only
the ones that are not fast-forward.

The current warning tests still pass, but not because of the arguments
or the configuration, but because they are all fast-forward.

We need to test non-fast-forward situations now.

Suggestions-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-15 17:39:42 -08:00
7539fdc629 pull: correct condition to trigger non-ff advice
Refactor the advise() call that teaches users how they can choose
between merge and rebase into a helper function.  This revealed that
the caller's logic needs to be further clarified to allow future
actions (like "erroring out" instead of the current "go ahead and
merge anyway") that should happen whether the advice message is
squelched out.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-15 17:39:42 -08:00
b044db9172 pull: get rid of unnecessary global variable
It is easy enough to do, and gives a more descriptive name to the
variable that is scoped in a more focused way.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-15 17:39:17 -08:00
6fcccbd755 merge-ort: add implementation of type-changed rename handling
Implement cases where renames are involved in type changes (i.e. the
side of history that didn't rename the file changed its type from a
regular file to a symlink or submodule).  There was some code to handle
this in merge-recursive but only in the special case when the renamed
file had no content changes.  The code here works differently -- it
knows process_entry() can handle mode conflicts, so it does a few
minimal tweaks to ensure process_entry() can just finish the job as
needed.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-15 17:18:32 -08:00
f1665e6918 merge-ort: add implementation of normal rename handling
Implement handling of normal renames.  This code replaces the following
from merge-recurisve.c:

  * the code relevant to RENAME_NORMAL in process_renames()
  * the RENAME_NORMAL case of process_entry()

Also, there is some shared code from merge-recursive.c for multiple
different rename cases which we will no longer need for this case (or
other rename cases):

  * handle_rename_normal()
  * setup_rename_conflict_info()

The consolidation of four separate codepaths into one is made possible
by a change in design: process_renames() tweaks the conflict_info
entries within opt->priv->paths such that process_entry() can then
handle all the non-rename conflict types (directory/file, modify/delete,
etc.) orthogonally.  This means we're much less likely to miss special
implementation of some kind of combination of conflict types (see
commits brought in by 66c62eaec6 ("Merge branch 'en/merge-tests'",
2020-11-18), especially commit ef52778708 ("merge tests: expect improved
directory/file conflict handling in ort", 2020-10-26) for more details).
That, together with letting worktree/index updating be handled
orthogonally in the merge_switch_to_result() function, dramatically
simplifies the code for various special rename cases.

(To be fair, the code for handling normal renames wasn't all that
complicated beforehand, but it's still much simpler now.)

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-15 17:18:32 -08:00
35e47e3514 merge-ort: add implementation of rename collisions
Implement rename/rename(2to1) and rename/add handling, i.e. a file is
renamed into a location where another file is added (with that other
file either being a plain add or itself coming from a rename).  Note
that rename collisions can also have a special case stacked on top: the
file being renamed on one side of history is deleted on the other
(yielding either a rename/add/delete conflict or perhaps a
rename/rename(2to1)/delete[/delete]) conflict.

One thing to note here is that when there is a double rename, the code
in question only handles one of them at a time; a later iteration
through the loop will handle the other.  After they've both been
handled, process_entry()'s normal add/add code can handle the collision.

This code replaces the following from merge-recurisve.c:

  * all the 2to1 code in process_renames()
  * the RENAME_TWO_FILES_TO_ONE case of process_entry()
  * handle_rename_rename_2to1()
  * handle_rename_add()

Also, there is some shared code from merge-recursive.c for multiple
different rename cases which we will no longer need for this case (or
other rename cases):

  * handle_file_collision()
  * setup_rename_conflict_info()

The consolidation of six separate codepaths into one is made possible
by a change in design: process_renames() tweaks the conflict_info
entries within opt->priv->paths such that process_entry() can then
handle all the non-rename conflict types (directory/file, modify/delete,
etc.) orthogonally.  This means we're much less likely to miss special
implementation of some kind of combination of conflict types (see
commits brought in by 66c62eaec6 ("Merge branch 'en/merge-tests'",
2020-11-18), especially commit ef52778708 ("merge tests: expect improved
directory/file conflict handling in ort", 2020-10-26) for more details).
That, together with letting worktree/index updating be handled
orthogonally in the merge_switch_to_result() function, dramatically
simplifies the code for various special rename cases.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-15 17:18:32 -08:00
2e91ddd24e merge-ort: add implementation of rename/delete conflicts
Implement rename/delete conflicts, i.e. one side renames a file and the
other deletes the file.  This code replaces the following from
merge-recurisve.c:

  * the code relevant to RENAME_DELETE in process_renames()
  * the RENAME_DELETE case of process_entry()
  * handle_rename_delete()

Also, there is some shared code from merge-recursive.c for multiple
different rename cases which we will no longer need for this case (or
other rename cases):

  * handle_change_delete()
  * setup_rename_conflict_info()

The consolidation of five separate codepaths into one is made possible
by a change in design: process_renames() tweaks the conflict_info
entries within opt->priv->paths such that process_entry() can then
handle all the non-rename conflict types (directory/file, modify/delete,
etc.) orthogonally.  This means we're much less likely to miss special
implementation of some kind of combination of conflict types (see
commits brought in by 66c62eaec6 ("Merge branch 'en/merge-tests'",
2020-11-18), especially commit ef52778708 ("merge tests: expect improved
directory/file conflict handling in ort", 2020-10-26) for more details).
That, together with letting worktree/index updating be handled
orthogonally in the merge_switch_to_result() function, dramatically
simplifies the code for various special rename cases.

To be fair, there is a _slight_ tweak to process_entry() here, because
rename/delete cases will also trigger the modify/delete codepath.
However, we only want a modify/delete message to be printed for a
rename/delete conflict if there is a content change in the renamed file
in addition to the rename.  So process_renames() and process_entry()
aren't quite fully orthogonal, but they are pretty close.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-15 17:18:32 -08:00
53e88a0353 merge-ort: add implementation of both sides renaming differently
Implement rename/rename(1to2) handling, i.e. both sides of history
renaming a file and rename it differently.  This code replaces the
following from merge-recurisve.c:

  * all the 1to2 code in process_renames()
  * the RENAME_ONE_FILE_TO_TWO case of process_entry()
  * handle_rename_rename_1to2()

Also, there is some shared code from merge-recursive.c for multiple
different rename cases which we will no longer need for this case (or
other rename cases):

  * handle_file_collision()
  * setup_rename_conflict_info()

The consolidation of five separate codepaths into one is made possible
by a change in design: process_renames() tweaks the conflict_info
entries within opt->priv->paths such that process_entry() can then
handle all the non-rename conflict types (directory/file, modify/delete,
etc.) orthogonally.  This means we're much less likely to miss special
implementation of some kind of combination of conflict types (see
commits brought in by 66c62eaec6 ("Merge branch 'en/merge-tests'",
2020-11-18), especially commit ef52778708 ("merge tests: expect improved
directory/file conflict handling in ort", 2020-10-26) for more details).
That, together with letting worktree/index updating be handled
orthogonally in the merge_switch_to_result() function, dramatically
simplifies the code for various special rename cases.

To be fair, there is a _slight_ tweak to process_entry() here to make
sure that the two different paths aren't marked as clean but are left in
a conflicted state.  So process_renames() and process_entry() aren't
quite entirely orthogonal, but they are pretty close.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-15 17:18:32 -08:00
af1e56c49e merge-ort: add implementation of both sides renaming identically
Implement rename/rename(1to1) handling, i.e. both sides of history
renaming a file but renaming the same way.  This code replaces the
following from merge-recurisve.c:

  * all the 1to1 code in process_renames()
  * the RENAME_ONE_FILE_TO_ONE case of process_entry()

Also, there is some shared code from merge-recursive.c for multiple
different rename cases which we will no longer need for this case (or
other rename cases):

  * handle_rename_normal()
  * setup_rename_conflict_info()

The consolidation of four separate codepaths into one is made possible
by a change in design: process_renames() tweaks the conflict_info
entries within opt->priv->paths such that process_entry() can then
handle all the non-rename conflict types (directory/file, modify/delete,
etc.) orthogonally.  This means we're much less likely to miss special
implementation of some kind of combination of conflict types (see
commits brought in by 66c62eaec6 ("Merge branch 'en/merge-tests'",
2020-11-18), especially commit ef52778708 ("merge tests: expect improved
directory/file conflict handling in ort", 2020-10-26) for more details).
That, together with letting worktree/index updating be handled
orthogonally in the merge_switch_to_result() function, dramatically
simplifies the code for various special rename cases.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-15 17:18:32 -08:00
c3b58472be pack-redundant: gauge the usage before proposing its removal
The subcommand is unusably slow and the reason why nobody reports it
as a performance bug is suspected to be the absense of users.  Let's
show a big message that asks the user to tell us that they still
care about the command when an attempt is made to run the command,
with an escape hatch to override it with a command line option.

In a few releases, we may turn it into an error and keep it for a
few more releases before finally removing it (during the whole time,
the plan to remove it would be interrupted by end user raising hand).

Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-15 14:30:11 -08:00
9db2ac5616 diffcore-rename: accelerate rename_dst setup
register_rename_src() simply references the passed pair inside
rename_src.  In contrast, add_rename_dst() did something entirely
different for rename_dst.  Instead of copying the passed pair, it made a
copy of the second diff_filespec from the passed pair, referenced it,
and then set the diff_rename_dst.pair field to NULL.  Later, when a
pairing is found, record_rename_pair() allocated a full diff_filepair
via diff_queue() and pointed its src and dst fields at the appropriate
diff_filespecs.  This contrast between register_rename_src() for the
rename_src data structure and add_rename_dst() for the rename_dst data
structure is oddly inconsistent and requires more memory and work than
necessary.  Let's just reference the original diff_filepair in
rename_dst as-is, just as we do with rename_src.  Add a new
rename_dst.is_rename field, since the rename_dst.p field is never NULL
unlike the old rename_dst.pair field.

Taking advantage of this change and the fact that same-named paths will
be adjacent, we can get rid of the sorting of the array and most of the
lookups on it, allowing us to instead just append as we go.  However,
there is one remaining reason to still keep locate_rename_dst():
handling broken pairs (i.e. when break detection is on).  Those are
somewhat rare, but we can set up a simple strintmap to get the map
between the source and the index.  Doing that allows us to still have a
fast lookup without sorting the rename_dst array.  Since the sorting had
been done in a weakly quadratic manner, when many renames are involved
this time could add up.

There is still a strcmp() in add_rename_dst() that I have left in place
to make it easier to verify that the algorithm has the same results.
This strcmp() is there to check for duplicate destination entries (which
was the easiest way at the time to avoid segfaults in the
diffcore-rename code when trees had multiple entries at a given path).
The underlying double free()s are no longer an issue with the new
algorithm, but that can be addressed in a subsequent commit.

This patch is being submitted in a different order than its original
development, but in a large rebase of many commits with lots of renames
and with several optimizations to inexact rename detection, both setup
time and write back to output queue time from diffcore_rename() were
sizeable chunks of overall runtime.  This patch accelerated the setup
time by about 65%, and final write back to the output queue time by
about 50%, resulting in an overall drop of 3.5% on the execution time of
rebasing a few dozen patches.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-14 09:34:50 -08:00
b970b4ef62 diffcore-rename: simplify and accelerate register_rename_src()
register_rename_src() took pains to create an array in rename_src which
was sorted by pathname of the contained diff_filepair.  The sorting was
entirely unnecessary since callers pass filepairs to us in sorted
order.  We can simply append to the end of the rename_src array,
speeding up diffcore_rename() setup time.

Also, note that I dropped the return type on the function since it was
unconditionally discarded anyway.

This patch is being submitted in a different order than its original
development, but in a large rebase of many commits with lots of renames
and with several optimizations to inexact rename detection,
diffcore_rename() setup time was a sizeable chunk of overall runtime.
This patch dropped execution time of rebasing 35 commits with lots of
renames by 2% overall.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-14 09:34:50 -08:00
ac14de13b2 t4058: explore duplicate tree entry handling in a bit more detail
While creating the last commit, I found a number of other cases where
git would segfault when faced with trees that have duplicate entries.
None of these segfaults are in the diffcore-rename code (they all occur
in cache-tree and unpack-trees).  Further, to my knowledge, no one has
ever been adversely affected by these bugs, and given that it has been
15 years and folks have fixed a few other issues with historical
duplicate entries (as noted in the last commit), I am not sure we will
ever run into anyone having problems with these.  So I am not sure these
are worth fixing, but it doesn't hurt to at least document these
failures in the same test file that is concerned with duplicate tree
entries.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-14 09:34:50 -08:00
5c72261c66 t4058: add more tests and documentation for duplicate tree entry handling
Commit 4d6be03b95 ("diffcore-rename: avoid processing duplicate
destinations", 2015-02-26) added t4058 to demonstrate that a workaround
it added to avoid double frees (namely to just turn off rename detection
when trees had duplicate entries) would indeed avoid segfaults.  The
tests, though, give the impression that the expected diffs are "correct"
when in reality they are just "don't segfault, and do something
semi-reasonable under the circumstances".  Add some notes to make this
clearer.

Also, commit 25d5ea410f ("[PATCH] Redo rename/copy detection logic.",
2005-05-24) added a similar workaround to avoid segfaults, but for
rename_src rather than rename_dst.  I do not see any tests in the
testsuite to cover the collision detection of entries limited to the
source side, so add a couple.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-14 09:34:50 -08:00
81c4bf0296 diffcore-rename: reduce jumpiness in progress counters
Inexact rename detection works by comparing all sources to all
destinations, computing similarities, and then finding the best matches
among those that are sufficiently similar.

However, it is preceded by exact rename detection that works by
checking if there are files with identical hashes.  If exact renames are
found, we can exclude some files from inexact rename detection.

The inexact rename detection loops over the full set of files, but
immediately skips those for which rename_dst[i].is_rename is true and
thus doesn't compare any sources to that destination.  As such, these
paths shouldn't be included in the progress counter.

For the eagle eyed, this change hints at an actual optimization -- the
first one I presented at Git Merge 2020.  I'll be submitting that
optimization later, once the basic merge-ort algorithm has merged.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-14 09:34:50 -08:00
ad8a1be529 diffcore-rename: simplify limit check
diffcore-rename had two different checks of the form

    if ((a < limit || b < limit) &&
        a * b <= limit * limit)

This can be simplified to

    if (st_mult(a, b) <= st_mult(limit, limit))

which makes it clearer how we are checking for overflow, and makes it
much easier to parse given the drop from 8 to 4 variable appearances.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-14 09:34:50 -08:00
00b8cccdd8 diffcore-rename: avoid usage of global in too_many_rename_candidates()
too_many_rename_candidates() got the number of rename destinations via
an argument to the function, but the number of rename sources via a
global variable.  That felt rather inconsistent.  Pass in the number of
rename sources as an argument as well.

While we are at it... We had a local variable, num_src, that served two
purposes.  Initially it was set to the global value, but later was used
for counting a subset of the number of sources.  Since we now have a
function argument for the former usage, introduce a clearer variable
name for the latter usage.

This patch has no behavioral changes; it's just renaming and passing an
argument instead of grabbing it from the global namespace.  (You may
find it easier to view the patch using git diff's --color-words option.)

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-14 09:34:50 -08:00
26a66a6b1c diffcore-rename: rename num_create to num_destinations
Our main data structures are rename_src and rename_dst.  For counters of
these data structures, num_sources and num_destinations seem natural;
definitely more so than using num_create for the latter.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-14 09:34:50 -08:00
278f4be806 pull: give the advice for choosing rebase/merge much later
Eventually we want to be omit the advice when we can fast-forward
in which case there is no reason to require the user to choose
between rebase or merge.

In order to do so, we need to delay giving the advice up to the
point where we can check if we can fast-forward or not.

Additionally, config_get_rebase() was probably never its true home.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-14 09:03:17 -08:00
77a7ec6329 pull: refactor fast-forward check
We would like to be able to make this check before the decision to
rebase is made in a future step.  Besides, using a separate helper
makes the code easier to follow.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-14 08:59:40 -08:00
c2d267df02 merge-ort: add basic outline for process_renames()
Add code which determines which kind of special rename case each rename
corresponds to, but leave the handling of each type unimplemented for
now.  Future commits will implement each one.

There is some tenuous resemblance to merge-recursive's
process_renames(), but comparing the two is very unlikely to yield any
insights.  merge-ort's process_renames() is a bit complex and I would
prefer if I could simplify it more, but it is far easier to grok than
merge-recursive's function of the same name in my opinion.  Plus,
merge-ort handles more rename conflict types than merge-recursive does.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-14 08:45:59 -08:00
965a7bc21c merge-ort: implement compare_pairs() and collect_renames()
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-14 08:45:59 -08:00
f39d05ca26 merge-ort: implement detect_regular_renames()
Based heavily on merge-recursive's get_diffpairs() function, and also
includes the necessary paired call to diff_warn_rename_limit() so that
users will be warned if merge.renameLimit is not sufficiently large for
rename detection to run.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-14 08:45:59 -08:00
e1a124e8dc merge-ort: add initial outline for basic rename detection
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-14 08:45:58 -08:00
864075ec43 merge-ort: add basic data structures for handling renames
This will grow later, but we only need a few fields for basic rename
handling.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-14 08:45:58 -08:00
c5a6f65527 merge-ort: add modify/delete handling and delayed output processing
The focus here is on adding a path_msg() which will queue up
warning/conflict/notice messages about the merge for later processing,
storing these in a pathname -> strbuf map.  It might seem like a big
change, but it really just is:

  * declaration of necessary map with some comments
  * initialization and recording of data
  * a bunch of code to iterate over the map at print/free time
  * at least one caller in order to avoid an error about having an
    unused function (which we provide in the form of implementing
    modify/delete conflict handling).

At this stage, it is probably not clear why I am opting for delayed
output processing.  There are multiple reasons:

  1. Merges are supposed to abort if they would overwrite dirty changes
     in the working tree.  We cannot correctly determine whether changes
     would be overwritten until both rename detection has occurred and
     full processing of entries with the renames has finalized.
     Warning/conflict/notice messages come up at intermediate codepaths
     along the way, so unless we want spurious conflict/warning messages
     being printed when the merge will be aborted anyway, we need to
     save these messages and only print them when relevant.

  2. There can be multiple messages for a single path, and we want all
     messages for a give path to appear together instead of having them
     grouped by conflict/warning type.  This was a problem already with
     merge-recursive.c but became even more important due to the
     splitting apart of conflict types as discussed in the commit
     message for 1f3c9ba707 ("t6425: be more flexible with rename/delete
     conflict messages", 2020-08-10)

  3. Some callers might want to avoid showing the output in certain
     cases, such as if the end result is a clean merge.  Rebases have
     typically done this.

  4. Some callers might not want the output to go to stdout or even
     stderr, but might want to do something else with it entirely.
     For example, a --remerge-diff option to `git show` or `git log
     -p` that remerges on the fly and diffs merge commits against the
     remerged version would benefit from stdout/stderr not being
     written to in the standard form.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-13 14:38:47 -08:00
e2e9dc030c merge-ort: add die-not-implemented stub handle_content_merge() function
This simplistic and weird-looking patch is here to facilitate future
patch submissions.  Adding this stub allows rename detection code to
reference it in one patch series, while a separate patch series can
define the implementation, and then both series can merge cleanly and
work nicely together at that point.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-13 14:38:47 -08:00
04af1879b9 merge-ort: add function grouping comments
Commit b658536f59 ("merge-ort: add some high-level algorithm structure",
2020-10-27) added high-level structure of the ort merge algorithm.  As
we have added more and more functions, that high-level structure has
been slightly obscured.  Since functions are still grouped according to
this high-level structure, add comments denoting sections where all the
functions are specifically tied to a piece of the high-level structure.

This function groupings include a few sub-divisions of the original
high-level structure, including some sub-divisions that are yet to be
submitted.  Each has (or will have) several functions all serving as
helpers to one or two main functions for each section.

As an added bonus, the comments will serve to provide a small textual
separation between nearby sections and allow the next three patch series
to be submitted independently and merge cleanly.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-13 14:38:47 -08:00
43c1dccb91 merge-ort: add a paths_to_free field to merge_options_internal
This field will be used in future patches to allow removal of paths from
opt->priv->paths.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-13 14:38:47 -08:00
1c7873cdf4 merge-ort: add a path_conflict field to merge_options_internal
This field is not yet used, but will be used by both the rename handling
code, and the conflict type handling code in process_entry().

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-13 14:38:40 -08:00
101bc5bc2d merge-ort: add a clear_internal_opts helper
Move most of merge_finalize() into a new helper function,
clear_internal_opts().  This is a step to facilitate recursive merges,
as well as some future optimizations.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-13 14:21:03 -08:00
67845745c1 merge-ort: add a few includes
Include blob.h for definition of blob_type, and commit-reach.h for
declarations of get_merge_bases() and in_merge_bases().  While none of
these are used yet, we want to avoid cross-dependencies in the next
three series of patches for merge-ort and merge them at the end; adding
these "#include"s now avoids textual conflicts.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-13 14:21:03 -08:00
89422d29b1 merge-ort: free data structures in merge_finalize()
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-13 14:18:20 -08:00
ef2b369387 merge-ort: add implementation of record_conflicted_index_entries()
After checkout(), the working tree has the appropriate contents, and the
index matches the working copy.  That means that all unmodified and
cleanly merged files have correct index entries, but conflicted entries
need to be updated.

We do this by looping over the conflicted entries, marking the existing
index entry for the path with CE_REMOVE, adding new higher order staged
for the path at the end of the index (ignoring normal index sort order),
and then at the end of the loop removing the CE_REMOVED-marked cache
entries and sorting the index.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-13 14:18:20 -08:00
70912f66de tree: enable cmp_cache_name_compare() to be used elsewhere
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-13 14:18:20 -08:00
6681ce5cf6 merge-ort: add implementation of checkout()
Since merge-ort creates a tree for its output, when there are no
conflicts, updating the working tree and index is as simple as using the
unpack_trees() machinery with a twoway_merge (i.e. doing the equivalent
of a "checkout" operation).

If there were conflicts in the merge, then since the tree we created
included all the conflict markers, then using the unpack_trees machinery
in this manner will still update the working tree correctly.  Further,
all index entries corresponding to cleanly merged files will also be
updated correctly by this procedure.  Index entries corresponding to
conflicted entries will appear as though the user had run "git add -u"
after the merge to accept all files as-is with conflict markers.

Thus, after running unpack_trees(), there needs to be a separate step
for updating the entries in the index corresponding to conflicted files.
This will be the job for the function record_conflicted_index_entris(),
which will be implemented in a subsequent commit.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-13 14:18:20 -08:00
9fefce68dc merge-ort: basic outline for merge_switch_to_result()
This adds a basic implementation for merge_switch_to_result(), though
just in terms of a few new empty functions that will be defined in
subsequent commits.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-13 14:18:20 -08:00
bb470f4e13 merge-ort: step 3 of tree writing -- handling subdirectories as we go
Our order for processing of entries means that if we have a tree of
files that looks like
   Makefile
   src/moduleA/foo.c
   src/moduleA/bar.c
   src/moduleB/baz.c
   src/moduleB/umm.c
   tokens.txt

Then we will process paths in the order of the leftmost column below.  I
have added two additional columns that help explain the algorithm that
follows; the 2nd column is there to remind us we have oid & mode info we
are tracking for each of these paths (which differs between the paths
which I'm not representing well here), and the third column annotates
the parent directory of the entry:
   tokens.txt               <version_info>    ""
   src/moduleB/umm.c        <version_info>    src/moduleB
   src/moduleB/baz.c        <version_info>    src/moduleB
   src/moduleB              <version_info>    src
   src/moduleA/foo.c        <version_info>    src/moduleA
   src/moduleA/bar.c        <version_info>    src/moduleA
   src/moduleA              <version_info>    src
   src                      <version_info>    ""
   Makefile                 <version_info>    ""

When the parent directory changes, if it's a subdirectory of the previous
parent directory (e.g. "" -> src/moduleB) then we can just keep appending.
If the parent directory differs from the previous parent directory and is
not a subdirectory, then we should process that directory.

So, for example, when we get to this point:
   tokens.txt               <version_info>    ""
   src/moduleB/umm.c        <version_info>    src/moduleB
   src/moduleB/baz.c        <version_info>    src/moduleB

and note that the next entry (src/moduleB) has a different parent than
the last one that isn't a subdirectory, we should write out a tree for it
   100644 blob <HASH> umm.c
   100644 blob <HASH> baz.c

then pop all the entries under that directory while recording the new
hash for that directory, leaving us with
   tokens.txt               <version_info>        ""
   src/moduleB              <new version_info>    src

This process repeats until at the end we get to
   tokens.txt               <version_info>        ""
   src                      <new version_info>    ""
   Makefile                 <version_info>        ""

and then we can write out the toplevel tree.  Since we potentially have
entries in our string_list corresponding to multiple different toplevel
directories, e.g. a slightly different repository might have:
   whizbang.txt             <version_info>        ""
   tokens.txt               <version_info>        ""
   src/moduleD              <new version_info>    src
   src/moduleC              <new version_info>    src
   src/moduleB              <new version_info>    src
   src/moduleA/foo.c        <version_info>        src/moduleA
   src/moduleA/bar.c        <version_info>        src/moduleA

When src/moduleA is popped off, we need to know that the "last
directory" reverts back to src, and how many entries in our string_list
are associated with that parent directory.  So I use an auxiliary
offsets string_list which would have (parent_directory,offset)
information of the form
   ""             0
   src            2
   src/moduleA    5

Whenever I write out a tree for a subdirectory, I set versions.nr to
the final offset value and then decrement offsets.nr...and then add
an entry to versions with a hash for the new directory.

The idea is relatively simple, there's just a lot of accounting to
implement this.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-13 14:18:20 -08:00
ee4012dcf9 merge-ort: step 2 of tree writing -- function to create tree object
Create a new function, write_tree(), which will take a list of
basenames, modes, and oids for a single directory and create a tree
object in the object-store.  We do not yet have just basenames, modes,
and oids for just a single directory (we have a mixture of entries from
all directory levels in the hierarchy) so we still die() before the
current call to write_tree(), but the next patch will rectify that.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-13 14:18:20 -08:00
a9945bba60 merge-ort: step 1 of tree writing -- record basenames, modes, and oids
As a step towards transforming the processed path->conflict_info entries
into an actual tree object, start recording basenames, modes, and oids
in a dir_metadata structure.  Subsequent commits will make use of this
to actually write a tree.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-13 14:18:20 -08:00
8adffaa818 merge-ort: have process_entries operate in a defined order
We want to handle paths below a directory before needing to handle the
directory itself.  Also, we want to handle the directory immediately
after the paths below it, so we can't use simple lexicographic ordering
from strcmp (which would insert foo.txt between foo and foo/file.c).
Copy string_list_df_name_compare() from merge-recursive.c, and set up a
string list of paths sorted by that function so that we can iterate in
the desired order.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-13 14:18:20 -08:00
6a02dd90c9 merge-ort: add a preliminary simple process_entries() implementation
Add a process_entries() implementation that just loops over the paths
and processes each one individually with an auxiliary process_entry()
call.  Add a basic process_entry() as well, which handles several cases
but leaves a few of the more involved ones with die-not-implemented
messages.  Also, although process_entries() is supposed to create a
tree, it does not yet have code to do so -- except in the special case
of merging completely empty trees.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-13 14:18:20 -08:00
291f29caf6 merge-ort: avoid recursing into identical trees
When all three trees have the same oid, there is no need to recurse into
these trees to find that all files within them happen to match.  We can
just record any one of the trees as the resolution of merging that
particular path.

Immediately resolving trees for other types of trivial tree merges (such
as one side matches the merge base, or the two sides match each other)
would prevent us from detecting renames for some paths, and thus prevent
us from doing three-way content merges for those paths whose renames we
did not detect.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-13 14:18:20 -08:00
98bf984167 merge-ort: record stage and auxiliary info for every path
Create a helper function, setup_path_info(), which can be used to record
all the information we want in a merged_info or conflict_info.  While
there is currently only one caller of this new function, and some of its
particular parameters are fixed, future callers of this function will be
added later.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-13 14:18:20 -08:00
34e557af54 merge-ort: compute a few more useful fields for collect_merge_info
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-13 14:18:20 -08:00
885f0063e9 merge-ort: avoid repeating fill_tree_descriptor() on the same tree
Three-way merges, by their nature, are going to often have two or more
trees match at a given subdirectory.  We can avoid calling
fill_tree_descriptor() on the same tree by checking when these trees
match.  Noting when various oids match will also be useful in other
calculations and optimizations as well.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-13 14:18:20 -08:00
d2bc1994f3 merge-ort: implement a very basic collect_merge_info()
This does not actually collect any necessary info other than the
pathnames involved, since it just allocates an all-zero conflict_info
and stuffs that into paths.  However, it invokes the traverse_trees()
machinery to walk over all the paths and sets up the basic
infrastructure we need.

I have left out a few obvious optimizations to try to make this patch as
short and obvious as possible.  A subsequent patch will add some of
those back in with some more useful data fields before we introduce a
patch that actually sets up the conflict_info fields.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-13 14:18:19 -08:00
0c0d705b5c merge-ort: add an err() function similar to one from merge-recursive
Various places in merge-recursive used an err() function when it hit
some kind of unrecoverable error.  That code was from the reusable bits
of merge-recursive.c that we liked, such as merge_3way, writing object
files to the object store, reading blobs from the object store, etc.  So
create a similar function to allow us to port that code over, and use it
for when we detect problems returned from collect_merge_info()'s
traverse_trees() call, which we will be adding next.

While we are at it, also add more documentation for the "clean" field
from struct merge_result, particularly since the name suggests a boolean
but it is not quite one and this is our first non-boolean usage.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-13 14:18:19 -08:00
c8017176ac merge-ort: use histogram diff
In my cursory investigation, histogram diffs are about 2% slower than
Myers diffs.  Others have probably done more detailed benchmarks.  But,
in short, histogram diffs have been around for years and in a number of
cases provide obviously better looking diffs where Myers diffs are
unintelligible but the performance hit has kept them from becoming the
default.

However, there are real merge bugs we know about that have triggered on
git.git and linux.git, which I don't have a clue how to address without
the additional information that I believe is provided by histogram
diffs.  See the following:

https://lore.kernel.org/git/20190816184051.GB13894@sigill.intra.peff.net/
https://lore.kernel.org/git/CABPp-BHvJHpSJT7sdFwfNcPn_sOXwJi3=o14qjZS3M8Rzcxe2A@mail.gmail.com/
https://lore.kernel.org/git/CABPp-BGtez4qjbtFT1hQoREfcJPmk9MzjhY5eEq1QhXT23tFOw@mail.gmail.com/

I don't like mismerges.  I really don't like silent mismerges.  While I
am sometimes willing to make performance and correctness tradeoff, I'm
much more interested in correctness in general.  I want to fix the above
bugs.  I have not yet started doing so, but I believe histogram diff at
least gives me an angle.  Unfortunately, I can't rely on using the
information from histogram diff unless it's in use.  And it hasn't been
used because of a few percentage performance hit.

In testcases I have looked at, merge-ort is _much_ faster than
merge-recursive for non-trivial merges/rebases/cherry-picks.  As such,
this is a golden opportunity to switch out the underlying diff algorithm
(at least the one used by the merge machinery; git-diff and git-log are
separate questions); doing so will allow me to get additional data and
improved diffs, and I believe it will help me fix the above bugs at some
point in the future.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-13 14:18:19 -08:00
e4171b1b6d merge-ort: port merge_start() from merge-recursive
merge_start() basically does a bunch of sanity checks, then allocates
and initializes opt->priv -- a struct merge_options_internal.

Most of the sanity checks are usable as-is.  The
allocation/intialization is a bit different since merge-ort has a very
different merge_options_internal than merge-recursive, but the idea is
the same.

The weirdest part here is that merge-ort and merge-recursive use the
same struct merge_options, even though merge_options has a number of
fields that are oddly specific to merge-recursive's internal
implementation and don't even make sense with merge-ort's high-level
design (e.g. buffer_output, which merge-ort has to always do).  I reused
the same data structure because:
  * most the fields made sense to both merge algorithms
  * making a new struct would have required making new enums or somehow
    externalizing them, and that was getting messy.
  * it simplifies converting the existing callers by not having to
    have different code paths for merge_options setup.

I also marked detect_renames as ignored.  We can revisit that later, but
in short: merge-recursive allowed turning off rename detection because
it was sometimes glacially slow.  When you speed something up by a few
orders of magnitude, it's worth revisiting whether that justification is
still relevant.  Besides, if folks find it's still too slow, perhaps
they have a better scaling case than I could find and maybe it turns up
some more optimizations we can add.  If it still is needed as an option,
it is easy to add later.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-13 14:18:19 -08:00
231e2dd49d merge-ort: add some high-level algorithm structure
merge_ort_nonrecursive_internal() will be used by both
merge_inmemory_nonrecursive() and merge_inmemory_recursive(); let's
focus on it for now.  It involves some setup -- merge_start() --
followed by the following chain of functions:

  collect_merge_info()
    This function will populate merge_options_internal's paths field,
    via a call to traverse_trees() and a new callback that will be added
    later.

  detect_and_process_renames()
    This function will detect renames, and then adjust entries in paths
    to move conflict stages from old pathnames into those for new
    pathnames, so that the next step doesn't have to think about renames
    and just can do three-way content merging and such.

  process_entries()
    This function determines how to take the various stages (versions of
    a file from the three different sides) and merge them, and whether
    to mark the result as conflicted or cleanly merged.  It also writes
    out these merged file versions as it goes to create a tree.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-13 14:18:19 -08:00
5b59c3db05 merge-ort: setup basic internal data structures
Set up some basic internal data structures.  The only carry-over from
merge-recursive.c is call_depth, though needed_rename_limit will be
added later.

The central piece of data will definitely be the strmap "paths", which
will map every relevant pathname under consideration to either a
merged_info or a conflict_info.  ("conflicted" is a strmap that is a
subset of "paths".)

merged_info contains all relevant information for a non-conflicted
entry.  conflict_info contains a merged_info, plus any additional
information about a conflict such as the higher orders stages involved
and the names of the paths those came from (handy once renames get
involved).  If an entry remains conflicted, the merged_info portion of a
conflict_info will later be filled with whatever version of the file
should be placed in the working directory (e.g. an as-merged-as-possible
variation that contains conflict markers).

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-13 14:18:19 -08:00
fac60b8925 rev-parse: add option for absolute or relative path formatting
git rev-parse has several options which print various paths.  Some of
these paths are printed relative to the current working directory, and
some are absolute.

Normally, this is not a problem, but there are times when one wants
paths entirely in one format or another.  This can be done trivially if
the paths are canonical, but canonicalizing paths is not possible on
some shell scripting environments which lack realpath(1) and also in Go,
which lacks functions that properly canonicalize paths on Windows.

To help out the scripter, let's provide an option which turns most of
the paths printed by git rev-parse to be either relative to the current
working directory or absolute and canonical.  Document which options are
affected and which are not so that users are not confused.

This approach is cleaner and tidier than providing duplicates of
existing options which are either relative or absolute.

Note that if the user needs both forms, it is possible to pass an
additional option in the middle of the command line which changes the
behavior of subsequent operations.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-12 23:35:51 -08:00
be6e0daee7 abspath: add a function to resolve paths with missing components
Currently, we have a function to resolve paths, strbuf_realpath.  This
function canonicalizes paths like realpath(3), but permits a trailing
component to be absent from the file system.  In other words, this is
the behavior of the GNU realpath(1) without any arguments.

In the future, we'll need this same behavior, except that we want to
allow for any number of missing trailing components, which is the
behavior of GNU realpath(1) with the -m option.  This is useful because
we'll want to canonicalize a path that may point to a not yet present
path under the .git directory.  For example, a user may want to know
where an arbitrary ref would be stored if it existed in the file system.

Let's refactor strbuf_realpath to move most of the code to an internal
function and then pass it two flags to control its behavior.  We'll add
a strbuf_realpath_forgiving function that has our new behavior, and
leave strbuf_realpath with the older, stricter behavior.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-12 23:35:47 -08:00
2762e17117 pretty format %(trailers) doc: avoid repetition
Change the documentation for the various %(trailers) options so it
isn't repeating part of the documentation for "only" about how boolean
values are handled. Instead, let's split the description of that into
general documentation at the top.

It then suffices to refer to it by listing the options as
"opt[=<BOOL>]". I'm also changing it to upper-case "[=<BOOL>]" from
"[=val]" for consistency with "<SEP>"

It took me a couple of readings to realize that these options were
referring back to the "only" option's treatment of boolean
values.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-09 14:16:42 -08:00
058761f1c1 pretty format %(trailers): add a "key_value_separator"
Add a "key_value_separator" option to the "%(trailers)" pretty format,
to go along with the existing "separator" argument. In combination
these two options make it trivial to produce machine-readable (e.g. \0
and \0\0-delimited) format output.

As elaborated on in a previous commit which added "keyonly" it was
needlessly tedious to extract structured data from "%(trailers)"
before the addition of this "key_value_separator" option. As seen by
the test being added here extracting this data now becomes trivial.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-09 14:16:42 -08:00
9d87d5ae02 pretty format %(trailers): add a "keyonly"
Add support for a "keyonly". This allows for easier parsing out of the
key and value. Before if you didn't want to make assumptions about how
the key was formatted. You'd need to parse it out as e.g.:

    --pretty=format:'%H%x00%(trailers:separator=%x00%x00)' \
                       '%x00%(trailers:separator=%x00%x00,valueonly)'

And then proceed to deduce keys by looking at those two and
subtracting the value plus the hardcoded ": " separator from the
non-valueonly %(trailers) line. Now it's possible to simply do:

    --pretty=format:'%H%x00%(trailers:separator=%x00%x00,keyonly)' \
                    '%x00%(trailers:separator=%x00%x00,valueonly)'

Which at least reduces it to a state machine where you get N keys and
correlate them with N values. Even better would be to have a way to
change the ": " delimiter to something easily machine-readable (a key
might contain ": " too). A follow-up change will add support for that.

I don't really have a use-case for just "keyonly" myself. I suppose it
would be useful in some cases as "key=*" matches case-insensitively,
so a plain "keyonly" will give you the variants of the keys you
matched. I'm mainly adding it to fix the inconsistency with
"valueonly".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-09 14:16:42 -08:00
8b966a0506 pretty-format %(trailers): fix broken standalone "valueonly"
Fix %(trailers:valueonly) being a noop due to on overly eager
optimization in format_trailer_info() which skips custom formatting if
no custom options are given.

When "valueonly" was added in d9b936db52 (pretty: add support for
"valueonly" option in %(trailers), 2019-01-28) we forgot to add it to
the list of options that optimization checks for. See e.g. the
addition of "key" in 250bea0c16 (pretty: allow showing specific
trailers, 2019-01-28) for a similar change where this wasn't missed.

Thus the "valueonly" option in "%(trailers:valueonly)" was a noop and
the output was equivalent to that of a plain "%(trailers)". This
wasn't caught because the tests for it always combined it with other
options.

Fix the bug by adding !opts->value_only to the list. I initially
attempted to make this more future-proof by setting a flag if we got
to ":" in "%(trailers:" in format_commit_one() in pretty.c. However,
"%(trailers:" is also parsed in trailers_atom_parser() in
ref-filter.c.

There is an outstanding patch[1] unify those two, and such a fix, or
other future-proofing, such as changing "process_trailer_options"
flags into a bitfield, would conflict with that effort. Let's instead
do the bare minimum here as this aspect of trailers is being actively
worked on by another series.

Let's also test for a plain "valueonly" without any other options, as
well as "separator". All the other existing options on the pretty.c
path had tests where they were the only option provided. I'm also
keeping a sanity test for "%(trailers:)" being the same as
"%(trailers)". There's no reason to suspect it wouldn't be in the
current implementation, but let's keep it in the interest of black box
testing.

1. https://lore.kernel.org/git/pull.726.git.1599335291.gitgitgadget@gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-09 14:16:42 -08:00
f077b0a986 pack-bitmap-write: better reuse bitmaps
If the old bitmap file contains a bitmap for a given commit, then that
commit does not need help from intermediate commits in its history to
compute its final bitmap. Eject that commit from the walk and insert it
into a separate list of reusable commits that are eventually stored in
the list of commits for computing bitmaps.

This helps the repeat bitmap computation task, even if the selected
commits shift drastically. This helps when a previously-bitmapped commit
exists in the first-parent history of a newly-selected commit. Since we
stop the walk at these commits and we use a first-parent walk, it is
harder to walk "around" these bitmapped commits. It's not impossible,
but we can greatly reduce the computation time for many selected
commits.

             |   runtime (sec)    |   peak heap (GB)   |
             |                    |                    |
             |   from  |   with   |   from  |   with   |
             | scratch | existing | scratch | existing |
  -----------+---------+----------+---------+-----------
  last patch |  88.478 |   53.218 |   2.157 |    2.224 |
  this patch |  86.681 |   16.164 |   2.157 |    2.222 |

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-08 14:49:07 -08:00
45f4eeb291 pack-bitmap-write: relax unique revwalk condition
The previous commits improved the bitmap computation process for very
long, linear histories with many refs by removing quadratic growth in
how many objects were walked. The strategy of computing "intermediate
commits" using bitmasks for which refs can reach those commits
partitioned the poset of reachable objects so each part could be walked
exactly once. This was effective for linear histories.

However, there was a (significant) drawback: wide histories with many
refs had an explosion of memory costs to compute the commit bitmasks
during the exploration that discovers these intermediate commits. Since
these wide histories are unlikely to repeat walking objects, the benefit
of walking objects multiple times was not expensive before. But now, the
commit walk *before computing bitmaps* is incredibly expensive.

In an effort to discover a happy medium, this change reduces the walk
for intermediate commits to only the first-parent history. This focuses
the walk on how the histories converge, which still has significant
reduction in repeat object walks. It is still possible to create
quadratic behavior in this version, but it is probably less likely in
realistic data shapes.

Here is some data taken on a fresh clone of the kernel:

             |   runtime (sec)    |   peak heap (GB)   |
             |                    |                    |
             |   from  |   with   |   from  |   with   |
             | scratch | existing | scratch | existing |
  -----------+---------+----------+---------+-----------
    original |  64.044 |   83.241 |   2.088 |    2.194 |
  last patch |  45.049 |   37.624 |   2.267 |    2.334 |
  this patch |  88.478 |   53.218 |   2.157 |    2.224 |

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-08 14:49:07 -08:00
341fa34887 pack-bitmap-write: use existing bitmaps
When constructing new bitmaps, we perform a commit and tree walk in
fill_bitmap_commit() and fill_bitmap_tree(). This walk would benefit
from using existing bitmaps when available. We must track the existing
bitmaps and translate them into the new object order, but this is
generally faster than parsing trees.

In fill_bitmap_commit(), we must reorder thing somewhat. The priority
queue walks commits from newest-to-oldest, which means we correctly stop
walking when reaching a commit with a bitmap. However, if we walk trees
interleaved with the commits, then we might be parsing trees that are
actually part of a re-used bitmap. To avoid over-walking trees, add them
to a LIFO queue and walk them after exploring commits completely.

On git.git, this reduces a second immediate bitmap computation from 2.0s
to 1.0s. On linux.git, we go from 32s to 22s. On chromium's fork
network, we go from 227s to 198s.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-08 14:49:06 -08:00
83578051a9 pack-bitmap: factor out 'add_commit_to_bitmap()'
'find_objects()' currently needs to interact with the bitmaps khash
pretty closely. To make 'find_objects()' read a little more
straightforwardly, remove some of the khash-level details into a new
function that describes what it does: 'add_commit_to_bitmap()'.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-08 14:49:06 -08:00
98c31f366a pack-bitmap: factor out 'bitmap_for_commit()'
A couple of callers within pack-bitmap.c duplicate logic to lookup a
given object id in the bitamps khash. Factor this out into a new
function, 'bitmap_for_commit()' to reduce some code duplication.

Make this new function non-static, since it will be used in later
commits from outside of pack-bitmap.c.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-08 14:49:04 -08:00
449fa5ee06 pack-bitmap-write: ignore BITMAP_FLAG_REUSE
The on-disk bitmap format has a flag to mark a bitmap to be "reused".
This is a rather curious feature, and works like this:

  - a run of pack-objects would decide to mark the last 80% of the
    bitmaps it generates with the reuse flag

  - the next time we generate bitmaps, we'd see those reuse flags from
    the last run, and mark those commits as special:

      - we'd be more likely to select those commits to get bitmaps in
        the new output

      - when generating the bitmap for a selected commit, we'd reuse the
        old bitmap as-is (rearranging the bits to match the new pack, of
        course)

However, neither of these behaviors particularly makes sense.

Just because a commit happened to be bitmapped last time does not make
it a good candidate for having a bitmap this time. In particular, we may
choose bitmaps based on how recent they are in history, or whether a ref
tip points to them, and those things will change. We're better off
re-considering fresh which commits are good candidates.

Reusing the existing bitmap _is_ a reasonable thing to do to save
computation. But only reusing exact bitmaps is a weak form of this. If
we have an old bitmap for A and now want a new bitmap for its child, we
should be able to compute that only by looking at trees and that are new
to the child. But this code would consider only exact reuse (which is
perhaps why it was eager to select those commits in the first place).

Furthermore, the recent switch to the reverse-edge algorithm for
generating bitmaps dropped this optimization entirely (and yet still
performs better).

So let's do a few cleanups:

 - drop the whole "reusing bitmaps" phase of generating bitmaps. It's
   not helping anything, and is mostly unused code (or worse, code that
   is using CPU but not doing anything useful)

 - drop the use of the on-disk reuse flag to select commits to bitmap

 - stop setting the on-disk reuse flag in bitmaps we generate (since
   nothing respects it anymore)

We will keep a few innards of the reuse code, which will help us
implement a more capable version of the "reuse" optimization:

 - simplify rebuild_existing_bitmaps() into a function that only builds
   the mapping of bits between the old and new orders, but doesn't
   actually convert any bitmaps

 - make rebuild_bitmap() public; we'll call it lazily to convert bitmaps
   as we traverse (using the mapping created above)

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-08 14:48:17 -08:00
089f751360 pack-bitmap-write: build fewer intermediate bitmaps
The bitmap_writer_build() method calls bitmap_builder_init() to
construct a list of commits reachable from the selected commits along
with a "reverse graph". This reverse graph has edges pointing from a
commit to other commits that can reach that commit. After computing a
reachability bitmap for a commit, the values in that bitmap are then
copied to the reachability bitmaps across the edges in the reverse
graph.

We can now relax the role of the reverse graph to greatly reduce the
number of intermediate reachability bitmaps we compute during this
reverse walk. The end result is that we walk objects the same number of
times as before when constructing the reachability bitmaps, but we also
spend much less time copying bits between bitmaps and have much lower
memory pressure in the process.

The core idea is to select a set of "important" commits based on
interactions among the sets of commits reachable from each selected commit.

The first technical concept is to create a new 'commit_mask' member in the
bb_commit struct. Note that the selected commits are provided in an
ordered array. The first thing to do is to mark the ith bit in the
commit_mask for the ith selected commit. As we walk the commit-graph, we
copy the bits in a commit's commit_mask to its parents. At the end of
the walk, the ith bit in the commit_mask for a commit C stores a boolean
representing "The ith selected commit can reach C."

As we walk, we will discover non-selected commits that are important. We
will get into this later, but those important commits must also receive
bit positions, growing the width of the bitmasks as we walk. At the true
end of the walk, the ith bit means "the ith _important_ commit can reach
C."

MAXIMAL COMMITS
---------------

We use a new 'maximal' bit in the bb_commit struct to represent whether
a commit is important or not. The term "maximal" comes from the
partially-ordered set of commits in the commit-graph where C >= P if P
is a parent of C, and then extending the relationship transitively.
Instead of taking the maximal commits across the entire commit-graph, we
instead focus on selecting each commit that is maximal among commits
with the same bits on in their commit_mask. This definition is
important, so let's consider an example.

Suppose we have three selected commits A, B, and C. These are assigned
bitmasks 100, 010, and 001 to start. Each of these can be marked as
maximal immediately because they each will be the uniquely maximal
commit that contains their own bit. Keep in mind that that these commits
may have different bitmasks after the walk; for example, if B can reach
C but A cannot, then the final bitmask for C is 011. Even in these
cases, C would still be a maximal commit among all commits with the
third bit on in their masks.

Now define sets X, Y, and Z to be the sets of commits reachable from A,
B, and C, respectively. The intersections of these sets correspond to
different bitmasks:

 * 100: X - (Y union Z)
 * 010: Y - (X union Z)
 * 001: Z - (X union Y)
 * 110: (X intersect Y) - Z
 * 101: (X intersect Z) - Y
 * 011: (Y intersect Z) - X
 * 111: X intersect Y intersect Z

This can be visualized with the following Hasse diagram:

	100    010    001
         | \  /   \  / |
         |  \/     \/  |
         |  /\     /\  |
         | /  \   /  \ |
        110    101    011
          \___  |  ___/
              \ | /
               111

Some of these bitmasks may not be represented, depending on the topology
of the commit-graph. In fact, we are counting on it, since the number of
possible bitmasks is exponential in the number of selected commits, but
is also limited by the total number of commits. In practice, very few
bitmasks are possible because most commits converge on a common "trunk"
in the commit history.

With this three-bit example, we wish to find commits that are maximal
for each bitmask. How can we identify this as we are walking?

As we walk, we visit a commit C. Since we are walking the commits in
topo-order, we know that C is visited after all of its children are
visited. Thus, when we get C from the revision walk we inspect the
'maximal' property of its bb_data and use that to determine if C is truly
important. Its commit_mask is also nearly final. If C is not one of the
originally-selected commits, then assign a bit position to C (by
incrementing num_maximal) and set that bit on in commit_mask. See
"MULTIPLE MAXIMAL COMMITS" below for more detail on this.

Now that the commit C is known to be maximal or not, consider each
parent P of C. Compute two new values:

 * c_not_p : true if and only if the commit_mask for C contains a bit
             that is not contained in the commit_mask for P.

 * p_not_c : true if and only if the commit_mask for P contains a bit
             that is not contained in the commit_mask for P.

If c_not_p is false, then P already has all of the bits that C would
provide to its commit_mask. In this case, move on to other parents as C
has nothing to contribute to P's state that was not already provided by
other children of P.

We continue with the case that c_not_p is true. This means there are
bits in C's commit_mask to copy to P's commit_mask, so use bitmap_or()
to add those bits.

If p_not_c is also true, then set the maximal bit for P to one. This means
that if no other commit has P as a parent, then P is definitely maximal.
This is because no child had the same bitmask. It is important to think
about the maximal bit for P at this point as a temporary state: "P is
maximal based on current information."

In contrast, if p_not_c is false, then set the maximal bit for P to
zero. Further, clear all reverse_edges for P since any edges that were
previously assigned to P are no longer important. P will gain all
reverse edges based on C.

The final thing we need to do is to update the reverse edges for P.
These reverse edges respresent "which closest maximal commits
contributed bits to my commit_mask?" Since C contributed bits to P's
commit_mask in this case, C must add to the reverse edges of P.

If C is maximal, then C is a 'closest' maximal commit that contributed
bits to P. Add C to P's reverse_edges list.

Otherwise, C has a list of maximal commits that contributed bits to its
bitmask (and this list is exactly one element). Add all of these items
to P's reverse_edges list. Be careful to ignore duplicates here.

After inspecting all parents P for a commit C, we can clear the
commit_mask for C. This reduces the memory load to be limited to the
"width" of the commit graph.

Consider our ABC/XYZ example from earlier and let's inspect the state of
the commits for an interesting bitmask, say 011. Suppose that D is the
only maximal commit with this bitmask (in the first three bits). All
other commits with bitmask 011 have D as the only entry in their
reverse_edges list. D's reverse_edges list contains B and C.

COMPUTING REACHABILITY BITMAPS
------------------------------

Now that we have our definition, let's zoom out and consider what
happens with our new reverse graph when computing reachability bitmaps.
We walk the reverse graph in reverse-topo-order, so we visit commits
with largest commit_masks first. After we compute the reachability
bitmap for a commit C, we push the bits in that bitmap to each commit D
in the reverse edge list for C. Then, when we finally visit D we already
have the bits for everything reachable from maximal commits that D can
reach and we only need to walk the objects in the set-difference.

In our ABC/XYZ example, when we finally walk for the commit A we only
need to walk commits with bitmask equal to A's bitmask. If that bitmask
is 100, then we are only walking commits in X - (Y union Z) because the
bitmap already contains the bits for objects reachable from (X intersect
Y) union (X intersect Z) (i.e. the bits from the reachability bitmaps
for the maximal commits with bitmasks 110 and 101).

The behavior is intended to walk each commit (and the trees that commit
introduces) at most once while allocating and copying fewer reachability
bitmaps. There is one caveat: what happens when there are multiple
maximal commits with the same bitmask, with respect to the initial set
of selected commits?

MULTIPLE MAXIMAL COMMITS
------------------------

Earlier, we mentioned that when we discover a new maximal commit, we
assign a new bit position to that commit and set that bit position to
one for that commit. This is absolutely important for interesting
commit-graphs such as git/git and torvalds/linux. The reason is due to
the existence of "butterflies" in the commit-graph partial order.

Here is an example of four commits forming a butterfly:

   I    J
   |\  /|
   | \/ |
   | /\ |
   |/  \|
   M    N
    \  /
     |/
     Q

Here, I and J both have parents M and N. In general, these do not need
to be exact parent relationships, but reachability relationships. The
most important part is that M and N cannot reach each other, so they are
independent in the partial order. If I had commit_mask 10 and J had
commit_mask 01, then M and N would both be assigned commit_mask 11 and
be maximal commits with the bitmask 11. Then, what happens when M and N
can both reach a commit Q? If Q is also assigned the bitmask 11, then it
is not maximal but is reachable from both M and N.

While this is not necessarily a deal-breaker for our abstract definition
of finding maximal commits according to a given bitmask, we have a few
issues that can come up in our larger picture of constructing
reachability bitmaps.

In particular, if we do not also consider Q to be a "maximal" commit,
then we will walk commits reachable from Q twice: once when computing
the reachability bitmap for M and another time when computing the
reachability bitmap for N. This becomes much worse if the topology
continues this pattern with multiple butterflies.

The solution has already been mentioned: each of M and N are assigned
their own bits to the bitmask and hence they become uniquely maximal for
their bitmasks. Finally, Q also becomes maximal and thus we do not need
to walk its commits multiple times. The final bitmasks for these commits
are as follows:

  I:10       J:01
   |\        /|
   | \ _____/ |
   | /\____   |
   |/      \  |
   M:111    N:1101
        \  /
       Q:1111

Further, Q's reverse edge list is { M, N }, while M and N both have
reverse edge list { I, J }.

PERFORMANCE MEASUREMENTS
------------------------

Now that we've spent a LOT of time on the theory of this algorithm,
let's show that this is actually worth all that effort.

To test the performance, use GIT_TRACE2_PERF=1 when running
'git repack -abd' in a repository with no existing reachability bitmaps.
This avoids any issues with keeping existing bitmaps to skew the
numbers.

Inspect the "building_bitmaps_total" region in the trace2 output to
focus on the portion of work that is affected by this change. Here are
the performance comparisons for a few repositories. The timings are for
the following versions of Git: "multi" is the timing from before any
reverse graph is constructed, where we might perform multiple
traversals. "reverse" is for the previous change where the reverse graph
has every reachable commit.  Finally "maximal" is the version introduced
here where the reverse graph only contains the maximal commits.

      Repository: git/git
           multi: 2.628 sec
         reverse: 2.344 sec
         maximal: 2.047 sec

      Repository: torvalds/linux
           multi: 64.7 sec
         reverse: 205.3 sec
         maximal: 44.7 sec

So in all cases we've not only recovered any time lost to switching to
the reverse-edge algorithm, but we come out ahead of "multi" in all
cases. Likewise, peak heap has gone back to something reasonable:

      Repository: torvalds/linux
           multi: 2.087 GB
         reverse: 3.141 GB
         maximal: 2.288 GB

While I do not have access to full fork networks on GitHub, Peff has run
this algorithm on the chromium/chromium fork network and reported a
change from 3 hours to ~233 seconds. That network is particularly
beneficial for this approach because it has a long, linear history along
with many tags. The "multi" approach was obviously quadratic and the new
approach is linear.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-08 14:48:17 -08:00
c6b0c3910c pack-bitmap.c: check reads more aggressively when loading
Before 'load_bitmap_entries_v1()' reads an actual EWAH bitmap, it should
check that it can safely do so by ensuring that there are at least 6
bytes available to be read (four for the commit's index position, and
then two more for the xor offset and flags, respectively).

Likewise, it should check that the commit index it read refers to a
legitimate object in the pack.

The first fix catches a truncation bug that was exposed when testing,
and the second is purely precautionary.

There are some possible future improvements, not pursued here. They are:

  - Computing the correct boundary of the bitmap itself in the caller
    and ensuring that we don't read past it. This may or may not be
    worth it, since in a truncation situation, all bets are off: (is the
    trailer still there and the bitmap entries malformed, or is the
    trailer truncated?). The best we can do is try to read what's there
    as if it's correct data (and protect ourselves when it's obviously
    bogus).

  - Avoid the magic "6" by teaching read_be32() and read_u8() (both of
    which are custom helpers for this function) to check sizes before
    advancing the pointers.

  - Adding more tests in this area. Testing these truncation situations
    are remarkably fragile to even subtle changes in the bitmap
    generation. So, the resulting tests are likely to be quite brittle.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-08 14:48:17 -08:00
928e3f42ad pack-bitmap-write: rename children to reverse_edges
The bitmap_builder_init() method walks the reachable commits in
topological order and constructs a "reverse graph" along the way. At the
moment, this reverse graph contains an edge from commit A to commit B if
and only if A is a parent of B. Thus, the name "children" is appropriate
for for this reverse graph.

In the next change, we will repurpose the reverse graph to not be
directly-adjacent commits in the commit-graph, but instead a more
abstract relationship. The previous changes have already incorporated
the necessary updates to fill_bitmap_commit() that allow these edges to
not be immediate children.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-08 14:48:17 -08:00
1467b9572a t5310: add branch-based checks
The current rev-list tests that check the bitmap data only work on HEAD
instead of multiple branches. Expand the test cases to handle both
'master' and 'other' branches.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-08 14:48:17 -08:00
597b2c39af commit: implement commit_list_contains()
It can be helpful to check if a commit_list contains a commit. Use
pointer equality, assuming lookup_commit() was used.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-08 14:48:16 -08:00
ed03a58b65 bitmap: implement bitmap_is_subset()
The bitmap_is_subset() function checks if the 'self' bitmap contains any
bitmaps that are not on in the 'other' bitmap. Up until this patch, it
had a declaration, but no implementation or callers. A subsequent patch
will want this function, so implement it here.

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-08 14:48:16 -08:00
6dc5ef759f pack-bitmap-write: fill bitmap with commit history
The current implementation of bitmap_writer_build() creates a
reachability bitmap for every walked commit. After computing a bitmap
for a commit, those bits are pushed to an in-progress bitmap for its
children.

fill_bitmap_commit() assumes the bits corresponding to objects
reachable from the parents of a commit are already set. This means that
when visiting a new commit, we only have to walk the objects reachable
between it and any of its parents.

A future change to bitmap_writer_build() will relax this condition so
not all parents have their bits set. Prepare for that by having
'fill_bitmap_commit()' walk parents until reaching commits whose bits
are already set. Then, walk the trees for these commits as well.

This has no functional change with the current implementation of
bitmap_writer_build().

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-08 14:48:16 -08:00
010e5eacfb pack-bitmap-write: pass ownership of intermediate bitmaps
Our algorithm to generate reachability bitmaps walks through the commit
graph from the bottom up, passing bitmap data from each commit to its
descendants. For a linear stretch of history like:

  A -- B -- C

our sequence of steps is:

  - compute the bitmap for A by walking its trees, etc

  - duplicate A's bitmap as a starting point for B; we can now free A's
    bitmap, since we only needed it as an intermediate result

  - OR in any extra objects that B can reach into its bitmap

  - duplicate B's bitmap as a starting point for C; likewise, free B's
    bitmap

  - OR in objects for C, and so on...

Rather than duplicating bitmaps and immediately freeing the original, we
can just pass ownership from commit to commit. Note that this doesn't
always work:

  - the recipient may be a merge which already has an intermediate
    bitmap from its other ancestor. In that case we have to OR our
    result into it. Note that the first ancestor to reach the merge does
    get to pass ownership, though.

  - we may have multiple children; we can only pass ownership to one of
    them

However, it happens often enough and copying bitmaps is expensive enough
that this provides a noticeable speedup. On a clone of linux.git, this
reduces the time to generate bitmaps from 205s to 70s. This is about the
same amount of time it took to generate bitmaps using our old "many
traversals" algorithm (the previous commit measures the identical
scenario as taking 63s). It unfortunately provides only a very modest
reduction in the peak memory usage, though.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-08 14:48:16 -08:00
4a9c581729 pack-bitmap-write: reimplement bitmap writing
The bitmap generation code works by iterating over the set of commits
for which we plan to write bitmaps, and then for each one performing a
traditional traversal over the reachable commits and trees, filling in
the bitmap. Between two traversals, we can often reuse the previous
bitmap result as long as the first commit is an ancestor of the second.
However, our worst case is that we may end up doing "n" complete
complete traversals to the root in order to create "n" bitmaps.

In a real-world case (the shared-storage repo consisting of all GitHub
forks of chromium/chromium), we perform very poorly: generating bitmaps
takes ~3 hours, whereas we can walk the whole object graph in ~3
minutes.

This commit completely rewrites the algorithm, with the goal of
accessing each object only once. It works roughly like this:

  - generate a list of commits in topo-order using a single traversal

  - invert the edges of the graph (so have parents point at their
    children)

  - make one pass in reverse topo-order, generating a bitmap for each
    commit and passing the result along to child nodes

We generate correct results because each node we visit has already had
all of its ancestors added to the bitmap. And we make only two linear
passes over the commits.

We also visit each tree usually only once. When filling in a bitmap, we
don't bother to recurse into trees whose bit is already set in the
bitmap (since we know we've already done so when setting their bit).
That means that if commit A references tree T, none of its descendants
will need to open T again. I say "usually", though, because it is
possible for a given tree to be mentioned in unrelated parts of history
(e.g., cherry-picking to a parallel branch).

So we've accomplished our goal, and the resulting algorithm is pretty
simple to understand. But there are some downsides, at least with this
initial implementation:

  - we no longer reuse the results of any on-disk bitmaps when
    generating. So we'd expect to sometimes be slower than the original
    when bitmaps already exist. However, this is something we'll be able
    to add back in later.

  - we use much more memory. Instead of keeping one bitmap in memory at
    a time, we're passing them up through the graph. So our memory use
    should scale with the graph width (times the size of a bitmap).

So how does it perform?

For a clone of linux.git, generating bitmaps from scratch with the old
algorithm took 63s. Using this algorithm it takes 205s. Which is much
worse, but _might_ be acceptable if it behaved linearly as the size
grew. It also increases peak heap usage by ~1G. That's not impossibly
large, but not encouraging.

On the complete fork-network of torvalds/linux, it increases the peak
RAM usage by 40GB. Yikes. (I forgot to record the time it took, but the
memory usage was too much to consider this reasonable anyway).

On the complete fork-network of chromium/chromium, I ran out of memory
before succeeding. Some back-of-the-envelope calculations indicate it
would need 80+GB to complete.

So at this stage, we've managed to make things much worse. But because
of the way this new algorithm is structured, there are a lot of
opportunities for optimization on top. We'll start implementing those in
the follow-on patches.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-08 14:48:16 -08:00
ccae08e822 ewah: add bitmap_dup() function
There's no easy way to make a copy of a bitmap. Obviously a caller can
iterate over the bits and set them one by one in a new bitmap, but we
can go much faster by copying whole words with memcpy().

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-08 14:48:16 -08:00
3ed675101a ewah: implement bitmap_or()
We have a function to bitwise-OR an ewah into an uncompressed bitmap,
but not to OR two uncompressed bitmaps. Let's add it.

Interestingly, we have a public header declaration going back to
e1273106f6 (ewah: compressed bitmap implementation, 2013-11-14), but the
function was never implemented. That was all OK since there were no
users of 'bitmap_or()', but a first caller will be added in a couple of
patches.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-08 14:48:16 -08:00
2e2d141afd ewah: make bitmap growth less aggressive
If you ask to set a bit in the Nth word and we haven't yet allocated
that many slots in our array, we'll increase the bitmap size to 2*N.
This means we might frequently end up with bitmaps that are twice the
necessary size (as soon as you ask for the biggest bit, we'll size up to
twice that).

But if we just allocate as many words as were asked for, we may not grow
fast enough. The worst case there is setting bit 0, then 1, etc. Each
time we grow we'd just extend by one more word, giving us linear
reallocations (and quadratic memory copies).

A middle ground is relying on alloc_nr(), which causes us to grow by a
factor of roughly 3/2 instead of 2. That's less aggressive than
doubling, and it may help avoid fragmenting memory. (If we start with N,
then grow twice, our total is N*(3/2)^2 = 9N/4. After growing twice,
that array of size 9N/4 can fit into the space vacated by the original
array and first growth, N+3N/2 = 10N/4 > 9N/4, leading to less
fragmentation in memory).

Our worst case is still 3/2N wasted bits (you set bit N-1, then setting
bit N causes us to grow by 3/2), but our average should be much better.

This isn't usually that big a deal, but it will matter as we shift the
reachability bitmap generation code to store more bitmaps in memory.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-08 14:48:16 -08:00
d574bf43e8 ewah: factor out bitmap growth
We auto-grow bitmaps when somebody asks to set a bit whose position is
outside of our currently allocated range. Other operations besides
single bit-setting might need to do this, too, so let's pull it into its
own function.

Note that we change the semantics a little: you now ask for the number
of words you'd like to have, not the id of the block you'd like to write
to.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-08 14:48:16 -08:00
2978b00691 rev-list: die when --test-bitmap detects a mismatch
You can use "git rev-list --test-bitmap HEAD" to check that bitmaps
produce the same answer we'd get from a regular traversal. But if we
detect an error, we only print "mismatch", and still exit with a
successful error code.

That makes the uses of --test-bitmap in the test suite (e.g., in t5310)
mostly pointless: even if we saw an error, the tests wouldn't notice.
Let's instead call die(), which will let these tests work as designed,
and alert us if the bitmaps are bogus.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-08 14:48:15 -08:00
c5cd749076 t5310: drop size of truncated ewah bitmap
We truncate the .bitmap file to 512 bytes and expect to run into
problems reading an individual ewah file. But this length is somewhat
arbitrary, and just happened to work when the test was added in
9d2e330b17 (ewah_read_mmap: bounds-check mmap reads, 2018-06-14).

An upcoming commit will change the size of the history we create in the
test repo, which will cause this test to fail. We can future-proof it a
bit more by reducing the size of the truncated bitmap file.

Signed-off-by: Jeff King <peff@peff.net>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-08 14:48:15 -08:00
ec6c7b4367 pack-bitmap: bounds-check size of cache extension
A .bitmap file may have a "name hash cache" extension, which puts a
sequence of uint32_t values (one per object) at the end of the file.
When we see a flag indicating this extension, we blindly subtract the
appropriate number of bytes from our available length. However, if the
.bitmap file is too short, we'll underflow our length variable and wrap
around, thinking we have a very large length. This can lead to reading
out-of-bounds bytes while loading individual ewah bitmaps.

We can fix this by checking the number of available bytes when we parse
the header. The existing "truncated bitmap" test is now split into two
tests: one where we don't have this extension at all (and hence actually
do try to read a truncated ewah bitmap) and one where we realize
up-front that we can't even fit in the cache structure. We'll check
stderr in each case to make sure we hit the error we're expecting.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-08 14:48:15 -08:00
ca51090200 pack-bitmap: fix header size check
When we parse a .bitmap header, we first check that we have enough bytes
to make a valid header. We do that based on sizeof(struct
bitmap_disk_header). However, as of 0f4d6cada8 (pack-bitmap: make bitmap
header handling hash agnostic, 2019-02-19), that struct oversizes its
checksum member to GIT_MAX_RAWSZ. That means we need to adjust for the
difference between that constant and the size of the actual hash we're
using. That commit adjusted the code which moves our pointer forward,
but forgot to update the size check.

This meant we were overly strict about the header size (requiring room
for a 32-byte worst-case hash, when sha1 is only 20 bytes). But in
practice it didn't matter because bitmap files tend to have at least 12
bytes of actual data anyway, so it was unlikely for a valid file to be
caught by this.

Let's fix it by pulling the header size into a separate variable and
using it in both spots. That fixes the bug and simplifies the code to make
it harder to have a mismatch like this in the future. It will also come
in handy in the next patch for more bounds checking.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-08 14:48:15 -08:00
3b1ca60f8f ewah/ewah_bitmap.c: avoid open-coding ALLOC_GROW()
'ewah/ewah_bitmap.c:buffer_grow()' is responsible for growing the buffer
used to store the bits of an EWAH bitmap. It is essentially doing the
same task as the 'ALLOC_GROW()' macro, so use that instead.

This simplifies the callers of 'buffer_grow()', who no longer have to
ask for a specific size, but rather specify how much of the buffer they
need. They also no longer need to guard 'buffer_grow()' behind an if
statement, since 'ALLOC_GROW()' (and, by extension, 'buffer_grow()') is
a noop if the buffer is already large enough.

But, the most significant change is that this fixes a bug when calling
buffer_grow() with both 'alloc_size' and 'new_size' set to 1. In this
case, truncating integer math will leave the new size set to 1, causing
the buffer to never grow.

Instead, let alloc_nr() handle this, which asks for '(new_size + 16) * 3
/ 2' instead of 'new_size * 3 / 2'.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-08 14:48:15 -08:00
8ef9312464 diff: do not show submodule with untracked files as "-dirty"
Git diff reports a submodule directory as -dirty even when there are
only untracked files in the submodule directory. This is inconsistent
with what `git describe --dirty` says when run in the submodule
directory in that state.

Make `--ignore-submodules=untracked` the default for `git diff` when
there is no configuration variable or command line option, so that the
command would not give '-dirty' suffix to a submodule whose working
tree has untracked files, to make it consistent with `git
describe --dirty` that is run in the submodule working tree.

And also make `--ignore-submodules=none` the default for `git status`
so that the user doesn't end up deleting a submodule that has
uncommitted (untracked) files.

Signed-off-by: Sangeeta Jain <sangunb09@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-08 14:27:35 -08:00
7c1f79fc16 pretty format %(trailers) test: split a long line
Split a very long line in a test introduced in 0b691d8685 (pretty:
add support for separator option in %(trailers), 2019-01-28). This
makes it easier to read, especially as follow-up commits will copy
this test as a template.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-07 10:23:11 -08:00
16c5690929 maintenance: include 'cron' details in docs
Advanced and expert users may want to know how 'git maintenance start'
schedules background maintenance in order to customize their own
schedules beyond what the maintenance.* config values allow. Start a new
set of sections in git-maintenance.txt that describe how 'cron' is used
to run these tasks.

This is particularly valuable for users who want to inspect what Git is
doing or for users who want to customize the schedule further. Having a
baseline can provide a way forward for users who have never worked with
cron schedules.

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-24 13:02:29 -08:00
31345d5545 maintenance: extract platform-specific scheduling
The existing schedule mechanism using 'cron' is supported by POSIX
platforms, but not Windows. It also works slightly differently on
macOS to significant detriment of the user experience. To allow for
new implementations on these platforms, extract a method that
performs the platform-specific scheduling mechanism. This will be
swapped at compile time with new implementations on specialized
platforms.

As we add this generality, rename GIT_TEST_CRONTAB to
GIT_TEST_MAINT_SCHEDULER. Further, this variable is now parsed as
"<scheduler>:<command>" so we can test platform-specific scheduling
logic even when not on the correct platform. By specifying the
<scheduler> in this string, we will be able to test all three sets of
Git logic from a Linux machine.

Co-authored-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-24 13:02:29 -08:00
8b70966aa9 tests: drop prereq PREPARE_FOR_MAIN_BRANCH where no longer needed
We introduced the `PREPARE_FOR_MAIN_BRANCH` prereq for the sole purpose
of allowing us to perform the non-trivial adjustments regarding the
`master` -> `main` rename before the automatable ones.

Now that the transition is almost complete, we can stop using it in most
instances. The only two exceptions are t5526 and t9902: at the time of
writing, there are other patches in flight that touch these test
scripts, therefore their transition to `main` is postponed to a later
date.

This patch is the result of this command:

	sed -i 's/PREPARE_FOR_MAIN_BRANCH[ ,]//' t/t[0-9]*.sh &&
	git checkout HEAD -- t/t5526\* t/t9902\*

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-19 15:44:18 -08:00
8dcf73c5c9 t99*: adjust the references to the default branch name "main"
Carefully excluding t9902, which sees independent development elsewhere
at the time of writing, we use `main` as the default branch name in
t9903. This trick was performed via

	$ (cd t &&
	   sed -i -e 's/master/main/g' -e 's/MASTER/MAIN/g' \
		-e 's/Master/Main/g' -- t99*.sh lib-cvs.sh &&
	   git checkout HEAD -- t9902\*)

This allows us to define `GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main`
for all tests (except the ones we specifically excluded for now).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-19 15:44:18 -08:00
46a29020bb tests(git-p4): transition to the default branch name main
In the previous commits, we adjusted the test suite to use the branch
name `main` for initial branches.

The `git p4`-related tests are a bit harder to adjust because `git p4`
uses the ref `refs/heads/p4/master` to track the remote branches, and
for now, we do not want to change that (this might be the subject of a
future patch series). We only need to adjust for the actual initial
branch name to be changed to `main`.

This allows us to define `GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main`
for those tests.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-19 15:44:18 -08:00
765577b5d0 t9[5-7]*: adjust the references to the default branch name "main"
This trick was performed via

	$ (cd t &&
	   sed -i -e 's/master/main/g' -e 's/MASTER/MAIN/g' \
		-e 's/Master/Main/g' -- t9[5-7]*.sh lib-cvs.sh)

This allows us to define `GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main`
for those tests.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-19 15:44:18 -08:00
a881baa2c3 t9[0-4]*: adjust the references to the default branch name "main"
This trick was performed via

	$ (cd t &&
	   sed -i -e 's/master/main/g' -e 's/MASTER/MAIN/g' \
		-e 's/Master/Main/g' -- t9[0-4]*.sh)

This allows us to define `GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main`
for those tests.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-19 15:44:18 -08:00
747f6c6805 t8*: adjust the references to the default branch name "main"
This trick was performed via

	$ (cd t &&
	   sed -i -e 's/master/main/g' -e 's/MASTER/MAIN/g' \
		-e 's/Master/Main/g' -- t8*.sh annotate*.sh)

This allows us to define `GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main`
for those tests.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-19 15:44:18 -08:00
1e2ae142c0 t7[5-9]*: adjust the references to the default branch name "main"
Excluding t7817, which is added in an unrelated patch series at the time
of writing, this adjusts t7[5-9]*. This trick was performed via

	$ (cd t &&
	   sed -i -e 's/master/main/g' -e 's/MASTER/MAIN/g' \
		-e 's/Master/Main/g' -- t7[5-9]*.sh)

This allows us to define `GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main`
for those tests.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-19 15:44:18 -08:00
01dc81336d t7[0-4]*: adjust the references to the default branch name "main"
Carefully excluding t7064, which sees independent development elsewhere
at the time of writing, we use `main` as the default branch name in
t7[0-4]*. This trick was performed via

	$ (cd t &&
	   sed -i -e 's/master/main/g' -e 's/MASTER/MAIN/g' \
		-e 's/Master/Main/g' -- t7[0-4]*.sh &&
	   git checkout HEAD -- t7064\*)

This allows us to define `GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main`
for those tests.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-19 15:44:18 -08:00
5902f5f460 t6[4-9]*: adjust the references to the default branch name "main"
This trick was performed via

	$ (cd t &&
	   sed -i -e 's/master/main/g' -e 's/MASTER/MAIN/g' \
		-e 's/Master/Main/g' -- t6[4-9]*.sh)

This allows us to define `GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main`
for those tests.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-19 15:44:18 -08:00
1f53df54eb t64*: preemptively adjust alignment to prepare for master -> main
We are in the process of renaming the default branch name to `main`,
which is two characters shorter than `master`. Therefore, some lines
need to be adjusted in t6416, t6422 and t6427 that want to align text
involving the default branch name.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-19 15:44:18 -08:00
1550bb6ed0 t6[0-3]*: adjust the references to the default branch name "main"
Carefully excluding t6300, which sees independent development elsewhere
at the time of writing, we use `main` as the default branch name in
t6[0-3]*. This trick was performed via

	$ (cd t &&
	   sed -i -e 's/master/main/g' -e 's/MASTER/MAIN/g' \
		-e 's/Master/Main/g' -- t6[0-3]*.sh &&
	   git checkout HEAD -- t6300\*)

This allows us to define `GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main`
for those tests.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-19 15:44:18 -08:00
95cf2c0187 t5[6-9]*: adjust the references to the default branch name "main"
This trick was performed via

	$ (cd t &&
	   sed -i -e 's/master/main/g' -e 's/MASTER/MAIN/g' \
		-e 's/Master/Main/g' -- t5[6-9]*.sh)

This allows us to define `GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main`
for those tests.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-19 15:44:18 -08:00
028cb644ec t55[4-9]*: adjust the references to the default branch name "main"
This trick was performed via

	$ (cd t &&
	   sed -i -e 's/master/main/g' -e 's/MASTER/MAIN/g' \
		-e 's/Master/Main/g' -e 's/retsam/niam/g' \
		-- t55[4-9]*.sh t556x*)

This allows us to define `GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main`
for those tests.

Note that t5541 uses the reversed `master` name: `retsam`. We replace it
by the equivalent for `main`: `niam`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-19 15:44:18 -08:00
3ac8f6301e t55[23]*: adjust the references to the default branch name "main"
Carefully excluding t5526, which sees independent development elsewhere
at the time of writing, we use `main` as the default branch name in
t55[23]*. This trick was performed via

	$ (cd t &&
	   sed -i -e 's/master/main/g' -e 's/MASTER/MAIN/g' \
		-e 's/Master/Main/g' -e 's/naster/nain/g' -- \
		t55[23]*.sh &&
	   git checkout HEAD -- t5526\*)

Note that t5533 contains a variation of the name `master` (`naster`)
that we rename here, too.

This commit allows us to define
`GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main` for that range of tests.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-19 15:44:18 -08:00
bc925ce3f3 t551*: adjust the references to the default branch name "main"
This trick was performed via

	$ (cd t &&
	   sed -i -e 's/master/main/g' -e 's/MASTER/MAIN/g' \
		-e 's/Master/Main/g' -- t551*.sh)

This allows us to define `GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main`
for those tests.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-19 15:44:18 -08:00
3275f4e886 t550*: adjust the references to the default branch name "main"
This trick was performed via

	$ (cd t &&
	   sed -i -e 's/master/main/g' -e 's/MASTER/MAIN/g' \
		-e 's/Master/Main/g' -- t550*.sh)

This allows us to define `GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main`
for those tests.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-19 15:44:18 -08:00
e4010de9f0 t5503: prepare aligned comment for replacing master with main
In an upcoming commit, we will use `main` as the default branch name in
t5503 instead of `master`. This will require extra padding in ASCII-art
commit graphs, which we hereby add preemptively.

By doing this preemptively rather than after the commit applying the
search-and-replace, it is more obvious that we caught all aligned
comments that are affected by the latter commit.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-19 15:44:18 -08:00
966b4be276 t5[0-4]*: adjust the references to the default branch name "main"
Carefully excluding t5310, which is developed independently of the
current patch series at the time of writing, we now use `main` as
default branch in t5[0-4]*. This trick was performed via

	$ (cd t &&
	   sed -i -e 's/master/main/g' -e 's/MASTER/MAIN/g' \
		-e 's/Master/Main/g' -- t5[0-4]*.sh &&
	   git checkout HEAD -- t5310\*)

This allows us to define `GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main`
for those tests.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-19 15:44:18 -08:00
4b071211e6 t5323: prepare centered comment for master -> main
We are about to search-and-replace all mentions of `master` in t5323 by
`main`, which is two characters shorter. To prepare for that, let's add
padding to centered lines that will make them briefly uncentered, but
will be re-centered in the commit that performs that rename.

Doing it this way (instead of padding after replacing) makes it easier
to verify the validity of the patch that replaces `master` by `main`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-19 15:44:18 -08:00
8f37854b18 t4*: adjust the references to the default branch name "main"
Carefully excluding t4013 and t4015, which see independent development
elsewhere at the time of writing, we use `main` as the default branch
name in t4*. This trick was performed via

	$ (cd t &&
	   sed -i -e 's/master/main/g' -e 's/MASTER/MAIN/g' \
		-e 's/Master/Main/g' -- t4*.sh t4211/*.export &&
	   git checkout HEAD -- t4013\*)

This allows us to define `GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main`
for those tests.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-19 15:44:18 -08:00
cbc75a12f0 t3[5-9]*: adjust the references to the default branch name "main"
This trick was performed via

	$ (cd t &&
	   sed -i -e 's/master/main/g' -e 's/MASTER/MAIN/g' \
		-e 's/Master/Main/g' -- t3[5-9]*.sh)

This allows us to define `GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main`
for those tests.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-19 15:44:18 -08:00
d1c02d93b3 t34*: adjust the references to the default branch name "main"
Carefully excluding t3404, which sees independent development elsewhere
at the time of writing, we use `main` as the default branch name in
t34*. This trick was performed via

	$ (cd t &&
	   sed -i -e 's/master/main/g' -e 's/MASTER/MAIN/g' \
		-e 's/Master/Main/g' -- t34*.sh &&
	   git checkout HEAD -- t34\*)

This allows us to define `GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main`
for those tests.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-19 15:44:18 -08:00
ba766eebee t3416: preemptively adjust alignment in a comment
We are about to adjust t3416 for the new default branch name `main`.
This name is two characters shorter and therefore needs two spaces more
padding to align correctly.

Adjusting the alignment before the big search-and-replace makes it
easier to verify that the final result does not leave any misaligned
lines behind.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-19 15:44:18 -08:00
d6c6b10817 t3[0-3]*: adjust the references to the default branch name "main"
Carefully excluding t3040, which sees independent development elsewhere
at the time of writing, we transition above-mentioned tests to the
default branch name `main`. This trick was performed via

	$ (cd t &&
	   sed -i -e 's/master/main/g' -e 's/MASTER/MAIN/g' \
		-e 's/Master/Main/g' -- t3[0-3]*.sh t3206/* &&
	   git checkout HEAD -- t3040\*)

This allows us to define `GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main`
for those tests.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-19 15:44:18 -08:00
883b98efad t2*: adjust the references to the default branch name "main"
Carefully excluding t2106, which sees independent development elsewhere
at the time of writing, we transition above-mentioned tests to the
default branch name `main`. This trick was performed via

	$ (cd t &&
	   sed -i -e 's/master/main/g' -e 's/MASTER/MAIN/g' \
		-e 's/Master/Main/g' -- t2*.sh &&
	   git checkout HEAD -- t2106\*)

This allows us to define `GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main`
for those tests.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-19 15:44:18 -08:00
06d531486e t[01]*: adjust the references to the default branch name "main"
Carefully excluding t1309, which sees independent development elsewhere
at the time of writing, we transition above-mentioned tests to the
default branch name `main`. This trick was performed via

	$ (cd t &&
	   sed -i -e 's/master/main/g' -e 's/MASTER/MAIN/g' \
		-e 's/Master/Main/g' -e 's/naster/nain/g' -- t[01]*.sh &&
	   git checkout HEAD -- t1309\*)

Note that t5533 contains a variation of the name `master` (`naster`)
that we rename here, too.

This allows us to define `GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main`
for those tests.

Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-19 15:44:18 -08:00
c2fdc8820c t0060: preemptively adjust alignment
We are about to adjust t0060 for the new default branch name `main`.
This name is two characters shorter and therefore needs two spaces more
padding to align correctly.

Adjusting the alignment before the big search-and-replace makes it
easier to verify that the final result does not leave any misaligned
lines behind.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-19 15:44:17 -08:00
334afbc76f tests: mark tests relying on the current default for init.defaultBranch
In addition to the manual adjustment to let the `linux-gcc` CI job run
the test suite with `master` and then with `main`, this patch makes sure
that GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME is set in all test scripts
that currently rely on the initial branch name being `master by default.

To determine which test scripts to mark up, the first step was to
force-set the default branch name to `master` in

- all test scripts that contain the keyword `master`,

- t4211, which expects `t/t4211/history.export` with a hard-coded ref to
  initialize the default branch,

- t5560 because it sources `t/t556x_common` which uses `master`,

- t8002 and t8012 because both source `t/annotate-tests.sh` which also
  uses `master`)

This trick was performed by this command:

	$ sed -i '/^ *\. \.\/\(test-lib\|lib-\(bash\|cvs\|git-svn\)\|gitweb-lib\)\.sh$/i\
	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=master\
	export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME\
	' $(git grep -l master t/t[0-9]*.sh) \
	t/t4211*.sh t/t5560*.sh t/t8002*.sh t/t8012*.sh

After that, careful, manual inspection revealed that some of the test
scripts containing the needle `master` do not actually rely on a
specific default branch name: either they mention `master` only in a
comment, or they initialize that branch specificially, or they do not
actually refer to the current default branch. Therefore, the
aforementioned modification was undone in those test scripts thusly:

	$ git checkout HEAD -- \
		t/t0027-auto-crlf.sh t/t0060-path-utils.sh \
		t/t1011-read-tree-sparse-checkout.sh \
		t/t1305-config-include.sh t/t1309-early-config.sh \
		t/t1402-check-ref-format.sh t/t1450-fsck.sh \
		t/t2024-checkout-dwim.sh \
		t/t2106-update-index-assume-unchanged.sh \
		t/t3040-subprojects-basic.sh t/t3301-notes.sh \
		t/t3308-notes-merge.sh t/t3423-rebase-reword.sh \
		t/t3436-rebase-more-options.sh \
		t/t4015-diff-whitespace.sh t/t4257-am-interactive.sh \
		t/t5323-pack-redundant.sh t/t5401-update-hooks.sh \
		t/t5511-refspec.sh t/t5526-fetch-submodules.sh \
		t/t5529-push-errors.sh t/t5530-upload-pack-error.sh \
		t/t5548-push-porcelain.sh \
		t/t5552-skipping-fetch-negotiator.sh \
		t/t5572-pull-submodule.sh t/t5608-clone-2gb.sh \
		t/t5614-clone-submodules-shallow.sh \
		t/t7508-status.sh t/t7606-merge-custom.sh \
		t/t9302-fast-import-unpack-limit.sh

We excluded one set of test scripts in these commands, though: the range
of `git p4` tests. The reason? `git p4` stores the (foreign) remote
branch in the branch called `p4/master`, which is obviously not the
default branch. Manual analysis revealed that only five of these tests
actually require a specific default branch name to pass; They were
modified thusly:

	$ sed -i '/^ *\. \.\/lib-git-p4\.sh$/i\
	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=master\
	export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME\
	' t/t980[0167]*.sh t/t9811*.sh

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-19 15:44:17 -08:00
fced6d171e Merge 'jk/diff-release-filespec-fix' into js/default-branch-name-tests-final-stretch
* jk/diff-release-filespec-fix:
  t7800: simplify difftool test
  diff: allow passing NULL to diff_free_filespec_data()
2020-11-19 15:27:59 -08:00
6d37ca2165 Merge branch 'en/strmap' into en/merge-ort-impl
* en/strmap:
  shortlog: use strset from strmap.h
  Use new HASHMAP_INIT macro to simplify hashmap initialization
  strmap: take advantage of FLEXPTR_ALLOC_STR when relevant
  strmap: enable allocations to come from a mem_pool
  strmap: add a strset sub-type
  strmap: split create_entry() out of strmap_put()
  strmap: add functions facilitating use as a string->int map
  strmap: enable faster clearing and reusing of strmaps
  strmap: add more utility functions
  strmap: new utility functions
  hashmap: provide deallocation function names
  hashmap: introduce a new hashmap_partial_clear()
  hashmap: allow re-use after hashmap_free()
  hashmap: adjust spacing to fix argument alignment
  hashmap: add usage documentation explaining hashmap_free[_entries]()
2020-11-11 12:56:29 -08:00
1446 changed files with 204772 additions and 124845 deletions

View File

@ -2,8 +2,15 @@ env:
CIRRUS_CLONE_DEPTH: 1
freebsd_12_task:
env:
GIT_PROVE_OPTS: "--timer --jobs 10"
GIT_TEST_OPTS: "--no-chain-lint --no-bin-wrappers"
MAKEFLAGS: "-j4"
DEFAULT_TEST_TARGET: prove
DEVELOPER: 1
freebsd_instance:
image: freebsd-12-1-release-amd64
image_family: freebsd-12-2
memory: 2G
install_script:
pkg install -y gettext gmake perl5
create_user_script:

1
.gitattributes vendored
View File

@ -6,6 +6,7 @@
*.pm eol=lf diff=perl
*.py eol=lf diff=python
*.bat eol=crlf
CODE_OF_CONDUCT.md -whitespace
/Documentation/**/*.txt eol=lf
/command-list.txt eol=lf
/GIT-VERSION-GEN eol=lf

View File

@ -12,15 +12,9 @@ jobs:
check-whitespace:
runs-on: ubuntu-latest
steps:
- name: Set commit count
shell: bash
run: echo "COMMIT_DEPTH=$((1+$COMMITS))" >>$GITHUB_ENV
env:
COMMITS: ${{ github.event.pull_request.commits }}
- uses: actions/checkout@v2
with:
fetch-depth: ${{ env.COMMIT_DEPTH }}
fetch-depth: 0
- name: git log --check
id: check_out
@ -47,25 +41,9 @@ jobs:
echo "${dash} ${etc}"
;;
esac
done <<< $(git log --check --pretty=format:"---% h% s" -${{github.event.pull_request.commits}})
done <<< $(git log --check --pretty=format:"---% h% s" ${{github.event.pull_request.base.sha}}..)
if test -n "${log}"
then
echo "::set-output name=checkout::"${log}""
exit 2
fi
- name: Add Check Output as Comment
uses: actions/github-script@v3
id: add-comment
env:
log: ${{ steps.check_out.outputs.checkout }}
with:
script: |
await github.issues.createComment({
issue_number: context.issue.number,
owner: context.repo.owner,
repo: context.repo.repo,
body: `Whitespace errors found in workflow ${{ github.workflow }}:\n\n\`\`\`\n${process.env.log.replace(/\\n/g, "\n")}\n\`\`\``
})
if: ${{ failure() }}

105
.github/workflows/l10n.yml vendored Normal file
View File

@ -0,0 +1,105 @@
name: git-l10n
on: [push, pull_request_target]
jobs:
git-po-helper:
if: >-
endsWith(github.repository, '/git-po') ||
contains(github.head_ref, 'l10n') ||
contains(github.ref, 'l10n')
runs-on: ubuntu-latest
permissions:
pull-requests: write
steps:
- name: Setup base and head objects
id: setup-tips
run: |
if test "${{ github.event_name }}" = "pull_request_target"
then
base=${{ github.event.pull_request.base.sha }}
head=${{ github.event.pull_request.head.sha }}
else
base=${{ github.event.before }}
head=${{ github.event.after }}
fi
echo "::set-output name=base::$base"
echo "::set-output name=head::$head"
- name: Run partial clone
run: |
git -c init.defaultBranch=master init --bare .
git remote add \
--mirror=fetch \
origin \
https://github.com/${{ github.repository }}
# Fetch tips that may be unreachable from github.ref:
# - For a forced push, "$base" may be unreachable.
# - For a "pull_request_target" event, "$head" may be unreachable.
args=
for commit in \
${{ steps.setup-tips.outputs.base }} \
${{ steps.setup-tips.outputs.head }}
do
case $commit in
*[^0]*)
args="$args $commit"
;;
*)
# Should not fetch ZERO-OID.
;;
esac
done
git -c protocol.version=2 fetch \
--progress \
--no-tags \
--no-write-fetch-head \
--filter=blob:none \
origin \
${{ github.ref }} \
$args
- uses: actions/setup-go@v2
with:
go-version: '>=1.16'
- name: Install git-po-helper
run: go install github.com/git-l10n/git-po-helper@main
- name: Install other dependencies
run: |
sudo apt-get update -q &&
sudo apt-get install -q -y gettext
- name: Run git-po-helper
id: check-commits
run: |
exit_code=0
git-po-helper check-commits \
--github-action-event="${{ github.event_name }}" -- \
${{ steps.setup-tips.outputs.base }}..${{ steps.setup-tips.outputs.head }} \
>git-po-helper.out 2>&1 || exit_code=$?
if test $exit_code -ne 0 || grep -q WARNING git-po-helper.out
then
# Remove ANSI colors which are proper for console logs but not
# proper for PR comment.
echo "COMMENT_BODY<<EOF" >>$GITHUB_ENV
perl -pe 's/\e\[[0-9;]*m//g; s/\bEOF$//g' git-po-helper.out >>$GITHUB_ENV
echo "EOF" >>$GITHUB_ENV
fi
cat git-po-helper.out
exit $exit_code
- name: Create comment in pull request for report
uses: mshick/add-pr-comment@v1
if: >-
always() &&
github.event_name == 'pull_request_target' &&
env.COMMENT_BODY != ''
with:
repo-token: ${{ secrets.GITHUB_TOKEN }}
repo-token-user-login: 'github-actions[bot]'
message: >
${{ steps.check-commits.outcome == 'failure' && 'Errors and warnings' || 'Warnings' }}
found by [git-po-helper](https://github.com/git-l10n/git-po-helper#readme) in workflow
[#${{ github.run_number }}](${{ env.GITHUB_SERVER_URL }}/${{ github.repository }}/actions/runs/${{ github.run_id }}):
```
${{ env.COMMENT_BODY }}
```

View File

@ -81,44 +81,21 @@ jobs:
if: needs.ci-config.outputs.enabled == 'yes'
runs-on: windows-latest
steps:
- uses: actions/checkout@v1
- name: download git-sdk-64-minimal
shell: bash
run: |
## Get artifact
urlbase=https://dev.azure.com/git-for-windows/git/_apis/build/builds
id=$(curl "$urlbase?definitions=22&statusFilter=completed&resultFilter=succeeded&\$top=1" |
jq -r ".value[] | .id")
download_url="$(curl "$urlbase/$id/artifacts" |
jq -r '.value[] | select(.name == "git-sdk-64-minimal").resource.downloadUrl')"
curl --connect-timeout 10 --retry 5 --retry-delay 0 --retry-max-time 240 \
-o artifacts.zip "$download_url"
## Unzip and remove the artifact
unzip artifacts.zip
rm artifacts.zip
- uses: actions/checkout@v2
- uses: git-for-windows/setup-git-for-windows-sdk@v1
- name: build
shell: powershell
shell: bash
env:
HOME: ${{runner.workspace}}
MSYSTEM: MINGW64
NO_PERL: 1
run: |
& .\git-sdk-64-minimal\usr\bin\bash.exe -lc @"
printf '%s\n' /git-sdk-64-minimal/ >>.git/info/exclude
ci/make-test-artifacts.sh artifacts
"@
- name: upload build artifacts
uses: actions/upload-artifact@v1
run: . /etc/profile && ci/make-test-artifacts.sh artifacts
- name: zip up tracked files
run: git archive -o artifacts/tracked.tar.gz HEAD
- name: upload tracked files and build artifacts
uses: actions/upload-artifact@v2
with:
name: windows-artifacts
path: artifacts
- name: upload git-sdk-64-minimal
uses: actions/upload-artifact@v1
with:
name: git-sdk-64-minimal
path: git-sdk-64-minimal
windows-test:
runs-on: windows-latest
needs: [windows-build]
@ -127,37 +104,25 @@ jobs:
matrix:
nr: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
steps:
- uses: actions/checkout@v1
- name: download build artifacts
uses: actions/download-artifact@v1
- name: download tracked files and build artifacts
uses: actions/download-artifact@v2
with:
name: windows-artifacts
path: ${{github.workspace}}
- name: extract build artifacts
- name: extract tracked files and build artifacts
shell: bash
run: tar xf artifacts.tar.gz
- name: download git-sdk-64-minimal
uses: actions/download-artifact@v1
with:
name: git-sdk-64-minimal
path: ${{github.workspace}}/git-sdk-64-minimal/
run: tar xf artifacts.tar.gz && tar xf tracked.tar.gz
- uses: git-for-windows/setup-git-for-windows-sdk@v1
- name: test
shell: powershell
run: |
& .\git-sdk-64-minimal\usr\bin\bash.exe -lc @"
# Let Git ignore the SDK
printf '%s\n' /git-sdk-64-minimal/ >>.git/info/exclude
ci/run-test-slice.sh ${{matrix.nr}} 10
"@
shell: bash
run: . /etc/profile && ci/run-test-slice.sh ${{matrix.nr}} 10
- name: ci/print-test-failures.sh
if: failure()
shell: powershell
run: |
& .\git-sdk-64-minimal\usr\bin\bash.exe -lc ci/print-test-failures.sh
shell: bash
run: ci/print-test-failures.sh
- name: Upload failed tests' directories
if: failure() && env.FAILED_TEST_ARTIFACTS != ''
uses: actions/upload-artifact@v1
uses: actions/upload-artifact@v2
with:
name: failed-tests-windows
path: ${{env.FAILED_TEST_ARTIFACTS}}
@ -165,27 +130,17 @@ jobs:
needs: ci-config
if: needs.ci-config.outputs.enabled == 'yes'
env:
MSYSTEM: MINGW64
NO_PERL: 1
GIT_CONFIG_PARAMETERS: "'user.name=CI' 'user.email=ci@git'"
runs-on: windows-latest
steps:
- uses: actions/checkout@v1
- name: download git-sdk-64-minimal
shell: bash
run: |
## Get artifact
urlbase=https://dev.azure.com/git-for-windows/git/_apis/build/builds
id=$(curl "$urlbase?definitions=22&statusFilter=completed&resultFilter=succeeded&\$top=1" |
jq -r ".value[] | .id")
download_url="$(curl "$urlbase/$id/artifacts" |
jq -r '.value[] | select(.name == "git-sdk-64-minimal").resource.downloadUrl')"
curl --connect-timeout 10 --retry 5 --retry-delay 0 --retry-max-time 240 \
-o artifacts.zip "$download_url"
## Unzip and remove the artifact
unzip artifacts.zip
rm artifacts.zip
- uses: actions/checkout@v2
- uses: git-for-windows/setup-git-for-windows-sdk@v1
- name: initialize vcpkg
uses: actions/checkout@v2
with:
repository: 'microsoft/vcpkg'
path: 'compat/vcbuild/vcpkg'
- name: download vcpkg artifacts
shell: powershell
run: |
@ -198,75 +153,59 @@ jobs:
- name: add msbuild to PATH
uses: microsoft/setup-msbuild@v1
- name: copy dlls to root
shell: powershell
run: |
& compat\vcbuild\vcpkg_copy_dlls.bat release
if (!$?) { exit(1) }
shell: cmd
run: compat\vcbuild\vcpkg_copy_dlls.bat release
- name: generate Visual Studio solution
shell: bash
run: |
cmake `pwd`/contrib/buildsystems/ -DCMAKE_PREFIX_PATH=`pwd`/compat/vcbuild/vcpkg/installed/x64-windows \
-DMSGFMT_EXE=`pwd`/git-sdk-64-minimal/mingw64/bin/msgfmt.exe -DPERL_TESTS=OFF -DPYTHON_TESTS=OFF -DCURL_NO_CURL_CMAKE=ON
-DNO_GETTEXT=YesPlease -DPERL_TESTS=OFF -DPYTHON_TESTS=OFF -DCURL_NO_CURL_CMAKE=ON
- name: MSBuild
run: msbuild git.sln -property:Configuration=Release -property:Platform=x64 -maxCpuCount:4 -property:PlatformToolset=v142
- name: bundle artifact tar
shell: powershell
shell: bash
env:
MSVC: 1
VCPKG_ROOT: ${{github.workspace}}\compat\vcbuild\vcpkg
run: |
& git-sdk-64-minimal\usr\bin\bash.exe -lc @"
mkdir -p artifacts &&
eval \"`$(make -n artifacts-tar INCLUDE_DLLS_IN_ARTIFACTS=YesPlease ARTIFACTS_DIRECTORY=artifacts 2>&1 | grep ^tar)\"
"@
- name: upload build artifacts
uses: actions/upload-artifact@v1
mkdir -p artifacts &&
eval "$(make -n artifacts-tar INCLUDE_DLLS_IN_ARTIFACTS=YesPlease ARTIFACTS_DIRECTORY=artifacts NO_GETTEXT=YesPlease 2>&1 | grep ^tar)"
- name: zip up tracked files
run: git archive -o artifacts/tracked.tar.gz HEAD
- name: upload tracked files and build artifacts
uses: actions/upload-artifact@v2
with:
name: vs-artifacts
path: artifacts
vs-test:
runs-on: windows-latest
needs: [vs-build, windows-build]
needs: vs-build
strategy:
fail-fast: false
matrix:
nr: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
steps:
- uses: actions/checkout@v1
- name: download git-sdk-64-minimal
uses: actions/download-artifact@v1
with:
name: git-sdk-64-minimal
path: ${{github.workspace}}/git-sdk-64-minimal/
- name: download build artifacts
uses: actions/download-artifact@v1
- uses: git-for-windows/setup-git-for-windows-sdk@v1
- name: download tracked files and build artifacts
uses: actions/download-artifact@v2
with:
name: vs-artifacts
path: ${{github.workspace}}
- name: extract build artifacts
- name: extract tracked files and build artifacts
shell: bash
run: tar xf artifacts.tar.gz
run: tar xf artifacts.tar.gz && tar xf tracked.tar.gz
- name: test
shell: powershell
shell: bash
env:
MSYSTEM: MINGW64
NO_SVN_TESTS: 1
GIT_TEST_SKIP_REBASE_P: 1
run: |
& .\git-sdk-64-minimal\usr\bin\bash.exe -lc @"
# Let Git ignore the SDK and the test-cache
printf '%s\n' /git-sdk-64-minimal/ /test-cache/ >>.git/info/exclude
ci/run-test-slice.sh ${{matrix.nr}} 10
"@
run: . /etc/profile && ci/run-test-slice.sh ${{matrix.nr}} 10
- name: ci/print-test-failures.sh
if: failure()
shell: powershell
run: |
& .\git-sdk-64-minimal\usr\bin\bash.exe -lc ci/print-test-failures.sh
shell: bash
run: ci/print-test-failures.sh
- name: Upload failed tests' directories
if: failure() && env.FAILED_TEST_ARTIFACTS != ''
uses: actions/upload-artifact@v1
uses: actions/upload-artifact@v2
with:
name: failed-tests-windows
path: ${{env.FAILED_TEST_ARTIFACTS}}
@ -289,7 +228,10 @@ jobs:
- jobname: osx-gcc
cc: gcc
pool: macos-latest
- jobname: GETTEXT_POISON
- jobname: linux-gcc-default
cc: gcc
pool: ubuntu-latest
- jobname: linux-leaks
cc: gcc
pool: ubuntu-latest
env:
@ -297,14 +239,14 @@ jobs:
jobname: ${{matrix.vector.jobname}}
runs-on: ${{matrix.vector.pool}}
steps:
- uses: actions/checkout@v1
- uses: actions/checkout@v2
- run: ci/install-dependencies.sh
- run: ci/run-build-and-tests.sh
- run: ci/print-test-failures.sh
if: failure()
- name: Upload failed tests' directories
if: failure() && env.FAILED_TEST_ARTIFACTS != ''
uses: actions/upload-artifact@v1
uses: actions/upload-artifact@v2
with:
name: failed-tests-${{matrix.vector.jobname}}
path: ${{env.FAILED_TEST_ARTIFACTS}}
@ -319,6 +261,8 @@ jobs:
image: alpine
- jobname: Linux32
image: daald/ubuntu32:xenial
- jobname: pedantic
image: fedora
env:
jobname: ${{matrix.vector.jobname}}
runs-on: ubuntu-latest
@ -342,9 +286,29 @@ jobs:
jobname: StaticAnalysis
runs-on: ubuntu-18.04
steps:
- uses: actions/checkout@v1
- uses: actions/checkout@v2
- run: ci/install-dependencies.sh
- run: ci/run-static-analysis.sh
sparse:
needs: ci-config
if: needs.ci-config.outputs.enabled == 'yes'
env:
jobname: sparse
runs-on: ubuntu-20.04
steps:
- name: Download a current `sparse` package
# Ubuntu's `sparse` version is too old for us
uses: git-for-windows/get-azure-pipelines-artifact@v0
with:
repository: git/git
definitionId: 10
artifact: sparse-20.04
- name: Install the current `sparse` package
run: sudo dpkg -i sparse-20.04/sparse_*.deb
- uses: actions/checkout@v2
- name: Install other dependencies
run: ci/install-dependencies.sh
- run: make sparse
documentation:
needs: ci-config
if: needs.ci-config.outputs.enabled == 'yes'
@ -352,6 +316,6 @@ jobs:
jobname: Documentation
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v1
- uses: actions/checkout@v2
- run: ci/install-dependencies.sh
- run: ci/test-documentation.sh

5
.gitignore vendored
View File

@ -33,6 +33,7 @@
/git-check-mailmap
/git-check-ref-format
/git-checkout
/git-checkout--worker
/git-checkout-index
/git-cherry
/git-cherry-pick
@ -124,7 +125,6 @@
/git-range-diff
/git-read-tree
/git-rebase
/git-rebase--preserve-merges
/git-receive-pack
/git-reflog
/git-remote
@ -162,6 +162,7 @@
/git-stripspace
/git-submodule
/git-submodule--helper
/git-subtree
/git-svn
/git-switch
/git-symbolic-ref
@ -188,6 +189,7 @@
/gitweb/static/gitweb.min.*
/config-list.h
/command-list.h
/hook-list.h
*.tar.gz
*.dsc
*.deb
@ -222,6 +224,7 @@
*.lib
*.res
*.sln
*.sp
*.suo
*.ncb
*.vcproj

View File

@ -220,6 +220,7 @@ Philipp A. Hartmann <pah@qo.cx> <ph@sorgh.de>
Philippe Bruhat <book@cpan.org>
Ralf Thielow <ralf.thielow@gmail.com> <ralf.thielow@googlemail.com>
Ramsay Jones <ramsay@ramsayjones.plus.com> <ramsay@ramsay1.demon.co.uk>
Ramkumar Ramachandra <r@artagnon.com> <artagnon@gmail.com>
Randall S. Becker <randall.becker@nexbridge.ca> <rsbecker@nexbridge.com>
René Scharfe <l.s.r@web.de> <rene.scharfe@lsrfire.ath.cx>
René Scharfe <l.s.r@web.de> Rene Scharfe

View File

@ -16,7 +16,7 @@ compiler:
matrix:
include:
- env: jobname=GETTEXT_POISON
- env: jobname=linux-gcc-default
os: linux
compiler:
addons:

View File

@ -8,73 +8,64 @@ this code of conduct may be banned from the community.
## Our Pledge
In the interest of fostering an open and welcoming environment, we as
contributors and maintainers pledge to make participation in our project and
our community a harassment-free experience for everyone, regardless of age,
body size, disability, ethnicity, sex characteristics, gender identity and
expression, level of experience, education, socio-economic status,
nationality, personal appearance, race, religion, or sexual identity and
orientation.
We as members, contributors, and leaders pledge to make participation in our
community a harassment-free experience for everyone, regardless of age, body
size, visible or invisible disability, ethnicity, sex characteristics, gender
identity and expression, level of experience, education, socio-economic status,
nationality, personal appearance, race, religion, or sexual identity
and orientation.
We pledge to act and interact in ways that contribute to an open, welcoming,
diverse, inclusive, and healthy community.
## Our Standards
Examples of behavior that contributes to creating a positive environment
include:
Examples of behavior that contributes to a positive environment for our
community include:
* Using welcoming and inclusive language
* Being respectful of differing viewpoints and experiences
* Gracefully accepting constructive criticism
* Focusing on what is best for the community
* Showing empathy towards other community members
* Demonstrating empathy and kindness toward other people
* Being respectful of differing opinions, viewpoints, and experiences
* Giving and gracefully accepting constructive feedback
* Accepting responsibility and apologizing to those affected by our mistakes,
and learning from the experience
* Focusing on what is best not just for us as individuals, but for the
overall community
Examples of unacceptable behavior by participants include:
Examples of unacceptable behavior include:
* The use of sexualized language or imagery and unwelcome sexual attention or
advances
* Trolling, insulting/derogatory comments, and personal or political attacks
* The use of sexualized language or imagery, and sexual attention or
advances of any kind
* Trolling, insulting or derogatory comments, and personal or political attacks
* Public or private harassment
* Publishing others' private information, such as a physical or electronic
address, without explicit permission
* Publishing others' private information, such as a physical or email
address, without their explicit permission
* Other conduct which could reasonably be considered inappropriate in a
professional setting
## Our Responsibilities
## Enforcement Responsibilities
Project maintainers are responsible for clarifying the standards of acceptable
behavior and are expected to take appropriate and fair corrective action in
response to any instances of unacceptable behavior.
Community leaders are responsible for clarifying and enforcing our standards of
acceptable behavior and will take appropriate and fair corrective action in
response to any behavior that they deem inappropriate, threatening, offensive,
or harmful.
Project maintainers have the right and responsibility to remove, edit, or
reject comments, commits, code, wiki edits, issues, and other contributions
that are not aligned to this Code of Conduct, or to ban temporarily or
permanently any contributor for other behaviors that they deem inappropriate,
threatening, offensive, or harmful.
Community leaders have the right and responsibility to remove, edit, or reject
comments, commits, code, wiki edits, issues, and other contributions that are
not aligned to this Code of Conduct, and will communicate reasons for moderation
decisions when appropriate.
## Scope
This Code of Conduct applies within all project spaces, and it also applies
when an individual is representing the project or its community in public
spaces. Examples of representing a project or community include using an
official project e-mail address, posting via an official social media account,
or acting as an appointed representative at an online or offline event.
Representation of a project may be further defined and clarified by project
maintainers.
This Code of Conduct applies within all community spaces, and also applies when
an individual is officially representing the community in public spaces.
Examples of representing our community include using an official e-mail address,
posting via an official social media account, or acting as an appointed
representative at an online or offline event.
## Enforcement
Instances of abusive, harassing, or otherwise unacceptable behavior may be
reported by contacting the project team at git@sfconservancy.org. All
complaints will be reviewed and investigated and will result in a response
that is deemed necessary and appropriate to the circumstances. The project
team is obligated to maintain confidentiality with regard to the reporter of
an incident. Further details of specific enforcement policies may be posted
separately.
Project maintainers who do not follow or enforce the Code of Conduct in good
faith may face temporary or permanent repercussions as determined by other
members of the project's leadership.
The project leadership team can be contacted by email as a whole at
reported to the community leaders responsible for enforcement at
git@sfconservancy.org, or individually:
- Ævar Arnfjörð Bjarmason <avarab@gmail.com>
@ -82,12 +73,73 @@ git@sfconservancy.org, or individually:
- Jeff King <peff@peff.net>
- Junio C Hamano <gitster@pobox.com>
All complaints will be reviewed and investigated promptly and fairly.
All community leaders are obligated to respect the privacy and security of the
reporter of any incident.
## Enforcement Guidelines
Community leaders will follow these Community Impact Guidelines in determining
the consequences for any action they deem in violation of this Code of Conduct:
### 1. Correction
**Community Impact**: Use of inappropriate language or other behavior deemed
unprofessional or unwelcome in the community.
**Consequence**: A private, written warning from community leaders, providing
clarity around the nature of the violation and an explanation of why the
behavior was inappropriate. A public apology may be requested.
### 2. Warning
**Community Impact**: A violation through a single incident or series
of actions.
**Consequence**: A warning with consequences for continued behavior. No
interaction with the people involved, including unsolicited interaction with
those enforcing the Code of Conduct, for a specified period of time. This
includes avoiding interactions in community spaces as well as external channels
like social media. Violating these terms may lead to a temporary or
permanent ban.
### 3. Temporary Ban
**Community Impact**: A serious violation of community standards, including
sustained inappropriate behavior.
**Consequence**: A temporary ban from any sort of interaction or public
communication with the community for a specified period of time. No public or
private interaction with the people involved, including unsolicited interaction
with those enforcing the Code of Conduct, is allowed during this period.
Violating these terms may lead to a permanent ban.
### 4. Permanent Ban
**Community Impact**: Demonstrating a pattern of violation of community
standards, including sustained inappropriate behavior, harassment of an
individual, or aggression toward or disparagement of classes of individuals.
**Consequence**: A permanent ban from any sort of public interaction within
the community.
## Attribution
This Code of Conduct is adapted from the [Contributor Covenant][homepage],
version 1.4, available at https://www.contributor-covenant.org/version/1/4/code-of-conduct.html
version 2.0, available at
[https://www.contributor-covenant.org/version/2/0/code_of_conduct.html][v2.0].
Community Impact Guidelines were inspired by
[Mozilla's code of conduct enforcement ladder][Mozilla CoC].
For answers to common questions about this code of conduct, see the FAQ at
[https://www.contributor-covenant.org/faq][FAQ]. Translations are available
at [https://www.contributor-covenant.org/translations][translations].
[homepage]: https://www.contributor-covenant.org
[v2.0]: https://www.contributor-covenant.org/version/2/0/code_of_conduct.html
[Mozilla CoC]: https://github.com/mozilla/diversity
[FAQ]: https://www.contributor-covenant.org/faq
[translations]: https://www.contributor-covenant.org/translations
For answers to common questions about this code of conduct, see
https://www.contributor-covenant.org/faq

View File

@ -14,4 +14,5 @@ manpage-base-url.xsl
SubmittingPatches.txt
tmp-doc-diff/
GIT-ASCIIDOCFLAGS
/.build/
/GIT-EXCLUDED-PROGRAMS

View File

@ -175,6 +175,11 @@ For shell scripts specifically (not exhaustive):
does not have such a problem.
- Even though "local" is not part of POSIX, we make heavy use of it
in our test suite. We do not use it in scripted Porcelains, and
hopefully nobody starts using "local" before they are reimplemented
in C ;-)
For C programs:
@ -498,7 +503,12 @@ Error Messages
- Do not end error messages with a full stop.
- Do not capitalize ("unable to open %s", not "Unable to open %s")
- Do not capitalize the first word, only because it is the first word
in the message ("unable to open %s", not "Unable to open %s"). But
"SHA-3 not supported" is fine, because the reason the first word is
capitalized is not because it is at the beginning of the sentence,
but because the word would be spelled in capital letters even when
it appeared in the middle of the sentence.
- Say what the error is first ("cannot open %s", not "%s: cannot open")
@ -541,6 +551,51 @@ Writing Documentation:
documentation, please see the documentation-related advice in the
Documentation/SubmittingPatches file).
In order to ensure the documentation is inclusive, avoid assuming
that an unspecified example person is male or female, and think
twice before using "he", "him", "she", or "her". Here are some
tips to avoid use of gendered pronouns:
- Prefer succinctness and matter-of-factly describing functionality
in the abstract. E.g.
--short:: Emit output in the short-format.
and avoid something like these overly verbose alternatives:
--short:: Use this to emit output in the short-format.
--short:: You can use this to get output in the short-format.
--short:: A user who prefers shorter output could....
--short:: Should a person and/or program want shorter output, he
she/they/it can...
This practice often eliminates the need to involve human actors in
your description, but it is a good practice regardless of the
avoidance of gendered pronouns.
- When it becomes awkward to stick to this style, prefer "you" when
addressing the the hypothetical user, and possibly "we" when
discussing how the program might react to the user. E.g.
You can use this option instead of --xyz, but we might remove
support for it in future versions.
while keeping in mind that you can probably be less verbose, e.g.
Use this instead of --xyz. This option might be removed in future
versions.
- If you still need to refer to an example person that is
third-person singular, you may resort to "singular they" to avoid
"he/she/him/her", e.g.
A contributor asks their upstream to pull from them.
Note that this sounds ungrammatical and unnatural to those who
learned that "they" is only used for third-person plural, e.g.
those who learn English as a second language in some parts of the
world.
Every user-visible change should be reflected in the documentation.
The same general rule as for code applies -- imitate the existing
conventions.

View File

@ -2,6 +2,8 @@
MAN1_TXT =
MAN5_TXT =
MAN7_TXT =
HOWTO_TXT =
DOC_DEP_TXT =
TECH_DOCS =
ARTICLES =
SP_ARTICLES =
@ -21,6 +23,7 @@ MAN1_TXT += gitweb.txt
MAN5_TXT += gitattributes.txt
MAN5_TXT += githooks.txt
MAN5_TXT += gitignore.txt
MAN5_TXT += gitmailmap.txt
MAN5_TXT += gitmodules.txt
MAN5_TXT += gitrepository-layout.txt
MAN5_TXT += gitweb.conf.txt
@ -41,6 +44,11 @@ MAN7_TXT += gittutorial-2.txt
MAN7_TXT += gittutorial.txt
MAN7_TXT += gitworkflows.txt
HOWTO_TXT += $(wildcard howto/*.txt)
DOC_DEP_TXT += $(wildcard *.txt)
DOC_DEP_TXT += $(wildcard config/*.txt)
ifdef MAN_FILTER
MAN_TXT = $(filter $(MAN_FILTER),$(MAN1_TXT) $(MAN5_TXT) $(MAN7_TXT))
else
@ -75,12 +83,14 @@ SP_ARTICLES += howto/rebuild-from-update-hook
SP_ARTICLES += howto/rebase-from-internal-branch
SP_ARTICLES += howto/keep-canonical-history-correct
SP_ARTICLES += howto/maintain-git
SP_ARTICLES += howto/coordinate-embargoed-releases
API_DOCS = $(patsubst %.txt,%,$(filter-out technical/api-index-skel.txt technical/api-index.txt, $(wildcard technical/api-*.txt)))
SP_ARTICLES += $(API_DOCS)
TECH_DOCS += MyFirstContribution
TECH_DOCS += MyFirstObjectWalk
TECH_DOCS += SubmittingPatches
TECH_DOCS += technical/bundle-format
TECH_DOCS += technical/hash-function-transition
TECH_DOCS += technical/http-protocol
TECH_DOCS += technical/index-format
@ -89,6 +99,7 @@ TECH_DOCS += technical/multi-pack-index
TECH_DOCS += technical/pack-format
TECH_DOCS += technical/pack-heuristics
TECH_DOCS += technical/pack-protocol
TECH_DOCS += technical/parallel-checkout
TECH_DOCS += technical/partial-clone
TECH_DOCS += technical/protocol-capabilities
TECH_DOCS += technical/protocol-common
@ -129,6 +140,7 @@ ASCIIDOC_CONF = -f asciidoc.conf
ASCIIDOC_COMMON = $(ASCIIDOC) $(ASCIIDOC_EXTRA) $(ASCIIDOC_CONF) \
-amanversion=$(GIT_VERSION) \
-amanmanual='Git Manual' -amansource='Git'
ASCIIDOC_DEPS = asciidoc.conf GIT-ASCIIDOCFLAGS
TXT_TO_HTML = $(ASCIIDOC_COMMON) -b $(ASCIIDOC_HTML)
TXT_TO_XML = $(ASCIIDOC_COMMON) -b $(ASCIIDOC_DOCBOOK)
MANPAGE_XSL = manpage-normal.xsl
@ -183,6 +195,7 @@ ASCIIDOC_DOCBOOK = docbook5
ASCIIDOC_EXTRA += -acompat-mode -atabsize=8
ASCIIDOC_EXTRA += -I. -rasciidoctor-extensions
ASCIIDOC_EXTRA += -alitdd='&\#x2d;&\#x2d;'
ASCIIDOC_DEPS = asciidoctor-extensions.rb GIT-ASCIIDOCFLAGS
DBLATEX_COMMON =
XMLTO_EXTRA += --skip-validation
XMLTO_EXTRA += -x manpage.xsl
@ -213,6 +226,7 @@ endif
ifneq ($(findstring $(MAKEFLAGS),s),s)
ifndef V
QUIET = @
QUIET_ASCIIDOC = @echo ' ' ASCIIDOC $@;
QUIET_XMLTO = @echo ' ' XMLTO $@;
QUIET_DB2TEXI = @echo ' ' DB2TEXI $@;
@ -220,11 +234,15 @@ ifndef V
QUIET_DBLATEX = @echo ' ' DBLATEX $@;
QUIET_XSLTPROC = @echo ' ' XSLTPROC $@;
QUIET_GEN = @echo ' ' GEN $@;
QUIET_LINT = @echo ' ' LINT $@;
QUIET_STDERR = 2> /dev/null
QUIET_SUBDIR0 = +@subdir=
QUIET_SUBDIR1 = ;$(NO_SUBDIR) echo ' ' SUBDIR $$subdir; \
$(MAKE) $(PRINT_DIR) -C $$subdir
QUIET_LINT_GITLINK = @echo ' ' LINT GITLINK $<;
QUIET_LINT_MANSEC = @echo ' ' LINT MAN SEC $<;
QUIET_LINT_MANEND = @echo ' ' LINT MAN END $<;
export V
endif
endif
@ -272,7 +290,7 @@ install-html: html
../GIT-VERSION-FILE: FORCE
$(QUIET_SUBDIR0)../ $(QUIET_SUBDIR1) GIT-VERSION-FILE
ifneq ($(MAKECMDGOALS),clean)
ifneq ($(filter-out lint-docs clean,$(MAKECMDGOALS)),)
-include ../GIT-VERSION-FILE
endif
@ -283,10 +301,8 @@ docdep_prereqs = \
mergetools-list.made $(mergetools_txt) \
cmd-list.made $(cmds_txt)
doc.dep : $(docdep_prereqs) $(wildcard *.txt) $(wildcard config/*.txt) build-docdep.perl
$(QUIET_GEN)$(RM) $@+ $@ && \
$(PERL_PATH) ./build-docdep.perl >$@+ $(QUIET_STDERR) && \
mv $@+ $@
doc.dep : $(docdep_prereqs) $(DOC_DEP_TXT) build-docdep.perl
$(QUIET_GEN)$(PERL_PATH) ./build-docdep.perl >$@ $(QUIET_STDERR)
ifneq ($(MAKECMDGOALS),clean)
-include doc.dep
@ -306,8 +322,7 @@ cmds_txt = cmds-ancillaryinterrogators.txt \
$(cmds_txt): cmd-list.made
cmd-list.made: cmd-list.perl ../command-list.txt $(MAN1_TXT)
$(QUIET_GEN)$(RM) $@ && \
$(PERL_PATH) ./cmd-list.perl ../command-list.txt $(cmds_txt) $(QUIET_STDERR) && \
$(QUIET_GEN)$(PERL_PATH) ./cmd-list.perl ../command-list.txt $(cmds_txt) $(QUIET_STDERR) && \
date >$@
mergetools_txt = mergetools-diff.txt mergetools-merge.txt
@ -315,7 +330,7 @@ mergetools_txt = mergetools-diff.txt mergetools-merge.txt
$(mergetools_txt): mergetools-list.made
mergetools-list.made: ../git-mergetool--lib.sh $(wildcard ../mergetools/*)
$(QUIET_GEN)$(RM) $@ && \
$(QUIET_GEN) \
$(SHELL_PATH) -c 'MERGE_TOOLS_DIR=../mergetools && \
. ../git-mergetool--lib.sh && \
show_tool_names can_diff "* " || :' >mergetools-diff.txt && \
@ -334,6 +349,7 @@ GIT-ASCIIDOCFLAGS: FORCE
fi
clean:
$(RM) -rf .build/
$(RM) *.xml *.xml+ *.html *.html+ *.1 *.5 *.7
$(RM) *.texi *.texi+ *.texi++ git.info gitman.info
$(RM) *.pdf
@ -344,32 +360,23 @@ clean:
$(RM) manpage-base-url.xsl
$(RM) GIT-ASCIIDOCFLAGS
$(MAN_HTML): %.html : %.txt asciidoc.conf asciidoctor-extensions.rb GIT-ASCIIDOCFLAGS
$(QUIET_ASCIIDOC)$(RM) $@+ $@ && \
$(TXT_TO_HTML) -d manpage -o $@+ $< && \
mv $@+ $@
$(MAN_HTML): %.html : %.txt $(ASCIIDOC_DEPS)
$(QUIET_ASCIIDOC)$(TXT_TO_HTML) -d manpage -o $@ $<
$(OBSOLETE_HTML): %.html : %.txto asciidoc.conf asciidoctor-extensions.rb GIT-ASCIIDOCFLAGS
$(QUIET_ASCIIDOC)$(RM) $@+ $@ && \
$(TXT_TO_HTML) -o $@+ $< && \
mv $@+ $@
$(OBSOLETE_HTML): %.html : %.txto $(ASCIIDOC_DEPS)
$(QUIET_ASCIIDOC)$(TXT_TO_HTML) -o $@ $<
manpage-base-url.xsl: manpage-base-url.xsl.in
$(QUIET_GEN)sed "s|@@MAN_BASE_URL@@|$(MAN_BASE_URL)|" $< > $@
%.1 %.5 %.7 : %.xml manpage-base-url.xsl $(wildcard manpage*.xsl)
$(QUIET_XMLTO)$(RM) $@ && \
$(XMLTO) -m $(MANPAGE_XSL) $(XMLTO_EXTRA) man $<
$(QUIET_XMLTO)$(XMLTO) -m $(MANPAGE_XSL) $(XMLTO_EXTRA) man $<
%.xml : %.txt asciidoc.conf asciidoctor-extensions.rb GIT-ASCIIDOCFLAGS
$(QUIET_ASCIIDOC)$(RM) $@+ $@ && \
$(TXT_TO_XML) -d manpage -o $@+ $< && \
mv $@+ $@
%.xml : %.txt $(ASCIIDOC_DEPS)
$(QUIET_ASCIIDOC)$(TXT_TO_XML) -d manpage -o $@ $<
user-manual.xml: user-manual.txt user-manual.conf asciidoctor-extensions.rb GIT-ASCIIDOCFLAGS
$(QUIET_ASCIIDOC)$(RM) $@+ $@ && \
$(TXT_TO_XML) -d book -o $@+ $< && \
mv $@+ $@
$(QUIET_ASCIIDOC)$(TXT_TO_XML) -d book -o $@ $<
technical/api-index.txt: technical/api-index-skel.txt \
technical/api-index.sh $(patsubst %,%.txt,$(API_DOCS))
@ -390,46 +397,35 @@ XSLTOPTS += --stringparam html.stylesheet docbook-xsl.css
XSLTOPTS += --param generate.consistent.ids 1
user-manual.html: user-manual.xml $(XSLT)
$(QUIET_XSLTPROC)$(RM) $@+ $@ && \
xsltproc $(XSLTOPTS) -o $@+ $(XSLT) $< && \
mv $@+ $@
$(QUIET_XSLTPROC)xsltproc $(XSLTOPTS) -o $@ $(XSLT) $<
git.info: user-manual.texi
$(QUIET_MAKEINFO)$(MAKEINFO) --no-split -o $@ user-manual.texi
user-manual.texi: user-manual.xml
$(QUIET_DB2TEXI)$(RM) $@+ $@ && \
$(DOCBOOK2X_TEXI) user-manual.xml --encoding=UTF-8 --to-stdout >$@++ && \
$(PERL_PATH) fix-texi.perl <$@++ >$@+ && \
rm $@++ && \
mv $@+ $@
$(QUIET_DB2TEXI)$(DOCBOOK2X_TEXI) user-manual.xml --encoding=UTF-8 --to-stdout >$@+ && \
$(PERL_PATH) fix-texi.perl <$@+ >$@ && \
$(RM) $@+
user-manual.pdf: user-manual.xml
$(QUIET_DBLATEX)$(RM) $@+ $@ && \
$(DBLATEX) -o $@+ $(DBLATEX_COMMON) $< && \
mv $@+ $@
$(QUIET_DBLATEX)$(DBLATEX) -o $@ $(DBLATEX_COMMON) $<
gitman.texi: $(MAN_XML) cat-texi.perl texi.xsl
$(QUIET_DB2TEXI)$(RM) $@+ $@ && \
$(QUIET_DB2TEXI) \
($(foreach xml,$(sort $(MAN_XML)),xsltproc -o $(xml)+ texi.xsl $(xml) && \
$(DOCBOOK2X_TEXI) --encoding=UTF-8 --to-stdout $(xml)+ && \
rm $(xml)+ &&) true) > $@++ && \
$(PERL_PATH) cat-texi.perl $@ <$@++ >$@+ && \
rm $@++ && \
mv $@+ $@
$(RM) $(xml)+ &&) true) > $@+ && \
$(PERL_PATH) cat-texi.perl $@ <$@+ >$@ && \
$(RM) $@+
gitman.info: gitman.texi
$(QUIET_MAKEINFO)$(MAKEINFO) --no-split --no-validate $*.texi
$(patsubst %.txt,%.texi,$(MAN_TXT)): %.texi : %.xml
$(QUIET_DB2TEXI)$(RM) $@+ $@ && \
$(DOCBOOK2X_TEXI) --to-stdout $*.xml >$@+ && \
mv $@+ $@
$(QUIET_DB2TEXI)$(DOCBOOK2X_TEXI) --to-stdout $*.xml >$@
howto-index.txt: howto-index.sh $(wildcard howto/*.txt)
$(QUIET_GEN)$(RM) $@+ $@ && \
'$(SHELL_PATH_SQ)' ./howto-index.sh $(sort $(wildcard howto/*.txt)) >$@+ && \
mv $@+ $@
howto-index.txt: howto-index.sh $(HOWTO_TXT)
$(QUIET_GEN)'$(SHELL_PATH_SQ)' ./howto-index.sh $(sort $(HOWTO_TXT)) >$@
$(patsubst %,%.html,$(ARTICLES)) : %.html : %.txt
$(QUIET_ASCIIDOC)$(TXT_TO_HTML) $*.txt
@ -437,11 +433,10 @@ $(patsubst %,%.html,$(ARTICLES)) : %.html : %.txt
WEBDOC_DEST = /pub/software/scm/git/docs
howto/%.html: ASCIIDOC_EXTRA += -a git-relative-html-prefix=../
$(patsubst %.txt,%.html,$(wildcard howto/*.txt)): %.html : %.txt GIT-ASCIIDOCFLAGS
$(QUIET_ASCIIDOC)$(RM) $@+ $@ && \
$(patsubst %.txt,%.html,$(HOWTO_TXT)): %.html : %.txt GIT-ASCIIDOCFLAGS
$(QUIET_ASCIIDOC) \
sed -e '1,/^$$/d' $< | \
$(TXT_TO_HTML) - >$@+ && \
mv $@+ $@
$(TXT_TO_HTML) - >$@
install-webdoc : html
'$(SHELL_PATH_SQ)' ./install-webdoc.sh $(WEBDOC_DEST)
@ -468,12 +463,68 @@ quick-install-html: require-htmlrepo
print-man1:
@for i in $(MAN1_TXT); do echo $$i; done
lint-docs::
$(QUIET_LINT)$(PERL_PATH) lint-gitlink.perl
## Lint: Common
.build:
$(QUIET)mkdir $@
.build/lint-docs: | .build
$(QUIET)mkdir $@
## Lint: gitlink
.build/lint-docs/gitlink: | .build/lint-docs
$(QUIET)mkdir $@
.build/lint-docs/gitlink/howto: | .build/lint-docs/gitlink
$(QUIET)mkdir $@
.build/lint-docs/gitlink/config: | .build/lint-docs/gitlink
$(QUIET)mkdir $@
LINT_DOCS_GITLINK = $(patsubst %.txt,.build/lint-docs/gitlink/%.ok,$(HOWTO_TXT) $(DOC_DEP_TXT))
$(LINT_DOCS_GITLINK): | .build/lint-docs/gitlink
$(LINT_DOCS_GITLINK): | .build/lint-docs/gitlink/howto
$(LINT_DOCS_GITLINK): | .build/lint-docs/gitlink/config
$(LINT_DOCS_GITLINK): lint-gitlink.perl
$(LINT_DOCS_GITLINK): .build/lint-docs/gitlink/%.ok: %.txt
$(QUIET_LINT_GITLINK)$(PERL_PATH) lint-gitlink.perl \
$< \
$(HOWTO_TXT) $(DOC_DEP_TXT) \
--section=1 $(MAN1_TXT) \
--section=5 $(MAN5_TXT) \
--section=7 $(MAN7_TXT) >$@
.PHONY: lint-docs-gitlink
lint-docs-gitlink: $(LINT_DOCS_GITLINK)
## Lint: man-end-blurb
.build/lint-docs/man-end-blurb: | .build/lint-docs
$(QUIET)mkdir $@
LINT_DOCS_MAN_END_BLURB = $(patsubst %.txt,.build/lint-docs/man-end-blurb/%.ok,$(MAN_TXT))
$(LINT_DOCS_MAN_END_BLURB): | .build/lint-docs/man-end-blurb
$(LINT_DOCS_MAN_END_BLURB): lint-man-end-blurb.perl
$(LINT_DOCS_MAN_END_BLURB): .build/lint-docs/man-end-blurb/%.ok: %.txt
$(QUIET_LINT_MANEND)$(PERL_PATH) lint-man-end-blurb.perl $< >$@
.PHONY: lint-docs-man-end-blurb
lint-docs-man-end-blurb: $(LINT_DOCS_MAN_END_BLURB)
## Lint: man-section-order
.build/lint-docs/man-section-order: | .build/lint-docs
$(QUIET)mkdir $@
LINT_DOCS_MAN_SECTION_ORDER = $(patsubst %.txt,.build/lint-docs/man-section-order/%.ok,$(MAN_TXT))
$(LINT_DOCS_MAN_SECTION_ORDER): | .build/lint-docs/man-section-order
$(LINT_DOCS_MAN_SECTION_ORDER): lint-man-section-order.perl
$(LINT_DOCS_MAN_SECTION_ORDER): .build/lint-docs/man-section-order/%.ok: %.txt
$(QUIET_LINT_MANSEC)$(PERL_PATH) lint-man-section-order.perl $< >$@
.PHONY: lint-docs-man-section-order
lint-docs-man-section-order: $(LINT_DOCS_MAN_SECTION_ORDER)
## Lint: list of targets above
.PHONY: lint-docs
lint-docs: lint-docs-gitlink
lint-docs: lint-docs-man-end-blurb
lint-docs: lint-docs-man-section-order
ifeq ($(wildcard po/Makefile),po/Makefile)
doc-l10n install-l10n::
$(MAKE) -C po $@
endif
# Delete the target file on error
.DELETE_ON_ERROR:
.PHONY: FORCE

View File

@ -47,7 +47,7 @@ Veteran contributors who are especially interested in helping mentor newcomers
are present on the list. In order to avoid search indexers, group membership is
required to view messages; anyone can join and no approval is required.
==== https://webchat.freenode.net/#git-devel[#git-devel] on Freenode
==== https://web.libera.chat/#git-devel[#git-devel] on Libera Chat
This IRC channel is for conversations between Git contributors. If someone is
currently online and knows the answer to your question, you can receive help
@ -664,7 +664,7 @@ mention the right animal somewhere:
----
test_expect_success 'runs correctly with no args and good output' '
git psuh >actual &&
test_i18ngrep Pony actual
grep Pony actual
'
----
@ -827,7 +827,7 @@ either examining recent pull requests where someone has been granted `/allow`
(https://github.com/gitgitgadget/git/pulls?utf8=%E2%9C%93&q=is%3Apr+is%3Aopen+%22%2Fallow%22[Search:
is:pr is:open "/allow"]), in which case both the author and the person who
granted the `/allow` can now `/allow` you, or by inquiring on the
https://webchat.freenode.net/#git-devel[#git-devel] IRC channel on Freenode
https://web.libera.chat/#git-devel[#git-devel] IRC channel on Libera Chat
linking your pull request and asking for someone to `/allow` you.
If the CI fails, you can update your changes with `git rebase -i` and push your
@ -1029,22 +1029,42 @@ kidding - be patient!)
[[v2-git-send-email]]
=== Sending v2
Skip ahead to <<reviewing,Responding to Reviews>> for information on how to
handle comments from reviewers. Continue this section when your topic branch is
shaped the way you want it to look for your patchset v2.
This section will focus on how to send a v2 of your patchset. To learn what
should go into v2, skip ahead to <<reviewing,Responding to Reviews>> for
information on how to handle comments from reviewers.
When you're ready with the next iteration of your patch, the process is fairly
similar.
First, generate your v2 patches again:
We'll reuse our `psuh` topic branch for v2. Before we make any changes, we'll
mark the tip of our v1 branch for easy reference:
----
$ git format-patch -v2 --cover-letter -o psuh/ master..psuh
$ git checkout psuh
$ git branch psuh-v1
----
This will add your v2 patches, all named like `v2-000n-my-commit-subject.patch`,
to the `psuh/` directory. You may notice that they are sitting alongside the v1
patches; that's fine, but be careful when you are ready to send them.
Refine your patch series by using `git rebase -i` to adjust commits based upon
reviewer comments. Once the patch series is ready for submission, generate your
patches again, but with some new flags:
----
$ git format-patch -v2 --cover-letter -o psuh/ --range-diff master..psuh-v1 master..
----
The `--range-diff master..psuh-v1` parameter tells `format-patch` to include a
range-diff between `psuh-v1` and `psuh` in the cover letter (see
linkgit:git-range-diff[1]). This helps tell reviewers about the differences
between your v1 and v2 patches.
The `-v2` parameter tells `format-patch` to output your patches
as version "2". For instance, you may notice that your v2 patches are
all named like `v2-000n-my-commit-subject.patch`. `-v2` will also format
your patches by prefixing them with "[PATCH v2]" instead of "[PATCH]",
and your range-diff will be prefaced with "Range-diff against v1".
Afer you run this command, `format-patch` will output the patches to the `psuh/`
directory, alongside the v1 patches. Using a single directory makes it easy to
refer to the old v1 patches while proofreading the v2 patches, but you will need
to be careful to send out only the v2 patches. We will use a pattern like
"psuh/v2-*.patch" (not "psuh/*.patch", which would match v1 and v2 patches).
Edit your cover letter again. Now is a good time to mention what's different
between your last version and now, if it's something significant. You do not
@ -1082,7 +1102,7 @@ to the command:
----
$ git send-email --to=target@example.com
--in-reply-to="<foo.12345.author@example.com>"
psuh/v2*
psuh/v2-*.patch
----
[[single-patch]]

View File

@ -691,7 +691,7 @@ help understand. In our case, that means we omit trees and blobs not directly
referenced by `HEAD` or `HEAD`'s history, because we begin the walk with only
`HEAD` in the `pending` list.)
First, we'll need to `#include "list-objects-filter-options.h`" and set up the
First, we'll need to `#include "list-objects-filter-options.h"` and set up the
`struct list_objects_filter_options` at the top of the function.
----
@ -779,7 +779,7 @@ Count all the objects within and modify the print statement:
while ((oid = oidset_iter_next(&oit)))
omitted_count++;
printf("commits %d\nblobs %d\ntags %d\ntrees%d\nomitted %d\n",
printf("commits %d\nblobs %d\ntags %d\ntrees %d\nomitted %d\n",
commit_count, blob_count, tag_count, tree_count, omitted_count);
----

View File

@ -50,7 +50,7 @@ Fixes since v1.6.0.2
if the working tree is currently dirty.
* "git for-each-ref --format=%(subject)" fixed for commits with no
no newline in the message body.
newline in the message body.
* "git remote" fixed to protect printf from user input.

View File

@ -365,7 +365,7 @@ details).
(merge 2fbd4f9 mh/maint-lockfile-overflow later to maint).
* Invocations of "git checkout" used internally by "git rebase" were
counted as "checkout", and affected later "git checkout -" to the
counted as "checkout", and affected later "git checkout -", which took
the user to an unexpected place.
(merge 3bed291 rr/rebase-checkout-reflog later to maint).

View File

@ -184,8 +184,8 @@ Performance, Internal Implementation, Development Support etc.
the ref backend in use, as its format is much richer than the
normal refs, and written directly by "git fetch" as a plain file..
* An unused binary has been discarded, and and a bunch of commands
have been turned into into built-in.
* An unused binary has been discarded, and a bunch of commands
have been turned into built-in.
* A handful of places in in-tree code still relied on being able to
execute the git subcommands, especially built-ins, in "git-foo"

View File

@ -1,24 +0,0 @@
Git v2.30.2 Release Notes
=========================
This release addresses the security issue CVE-2022-24765.
Fixes since v2.30.2
-------------------
* Build fix on Windows.
* Fix `GIT_CEILING_DIRECTORIES` with Windows-style root directories.
* CVE-2022-24765:
On multi-user machines, Git users might find themselves
unexpectedly in a Git worktree, e.g. when another user created a
repository in `C:\.git`, in a mounted network drive or in a
scratch space. Merely having a Git-aware prompt that runs `git
status` (or `git diff`) and navigating to a directory which is
supposedly not a Git worktree, or opening such a directory in an
editor or IDE such as VS Code or Atom, will potentially run
commands defined by that other user.
Credit for finding this vulnerability goes to 俞晨东; The fix was
authored by Johannes Schindelin.

View File

@ -1,21 +0,0 @@
Git v2.30.4 Release Notes
=========================
This release contains minor fix-ups for the changes that went into
Git 2.30.3, which was made to address CVE-2022-24765.
* The code that was meant to parse the new `safe.directory`
configuration variable was not checking what configuration
variable was being fed to it, which has been corrected.
* '*' can be used as the value for the `safe.directory` variable to
signal that the user considers that any directory is safe.
Derrick Stolee (2):
t0033: add tests for safe.directory
setup: opt-out of check with safe.directory=*
Matheus Valadares (1):
setup: fix safe.directory key not being checked

View File

@ -1,12 +0,0 @@
Git v2.30.5 Release Notes
=========================
This release contains minor fix-ups for the changes that went into
Git 2.30.3 and 2.30.4, addressing CVE-2022-29187.
* The safety check that verifies a safe ownership of the Git
worktree is now extended to also cover the ownership of the Git
directory (and the `.git` file, if there is any).
Carlo Marcelo Arenas Belón (1):
setup: tighten ownership checks post CVE-2022-24765

View File

@ -1,60 +0,0 @@
Git v2.30.6 Release Notes
=========================
This release addresses the security issues CVE-2022-39253 and
CVE-2022-39260.
Fixes since v2.30.5
-------------------
* CVE-2022-39253:
When relying on the `--local` clone optimization, Git dereferences
symbolic links in the source repository before creating hardlinks
(or copies) of the dereferenced link in the destination repository.
This can lead to surprising behavior where arbitrary files are
present in a repository's `$GIT_DIR` when cloning from a malicious
repository.
Git will no longer dereference symbolic links via the `--local`
clone mechanism, and will instead refuse to clone repositories that
have symbolic links present in the `$GIT_DIR/objects` directory.
Additionally, the value of `protocol.file.allow` is changed to be
"user" by default.
* CVE-2022-39260:
An overly-long command string given to `git shell` can result in
overflow in `split_cmdline()`, leading to arbitrary heap writes and
remote code execution when `git shell` is exposed and the directory
`$HOME/git-shell-commands` exists.
`git shell` is taught to refuse interactive commands that are
longer than 4MiB in size. `split_cmdline()` is hardened to reject
inputs larger than 2GiB.
Credit for finding CVE-2022-39253 goes to Cory Snider of Mirantis. The
fix was authored by Taylor Blau, with help from Johannes Schindelin.
Credit for finding CVE-2022-39260 goes to Kevin Backhouse of GitHub.
The fix was authored by Kevin Backhouse, Jeff King, and Taylor Blau.
Jeff King (2):
shell: add basic tests
shell: limit size of interactive commands
Kevin Backhouse (1):
alias.c: reject too-long cmdline strings in split_cmdline()
Taylor Blau (11):
builtin/clone.c: disallow `--local` clones with symlinks
t/lib-submodule-update.sh: allow local submodules
t/t1NNN: allow local submodules
t/2NNNN: allow local submodules
t/t3NNN: allow local submodules
t/t4NNN: allow local submodules
t/t5NNN: allow local submodules
t/t6NNN: allow local submodules
t/t7NNN: allow local submodules
t/t9NNN: allow local submodules
transport: make `protocol.file.allow` be "user" by default

View File

@ -0,0 +1,365 @@
Git 2.31 Release Notes
======================
Updates since v2.30
-------------------
Backward incompatible and other important changes
* The "pack-redundant" command, which has been left stale with almost
unusable performance issues, now warns loudly when it gets used, as
we no longer want to recommend its use (instead just "repack -d"
instead).
* The development community has adopted Contributor Covenant v2.0 to
update from v1.4 that we have been using.
* The support for deprecated PCRE1 library has been dropped.
* Fixes for CVE-2021-21300 in Git 2.30.2 (and earlier) is included.
UI, Workflows & Features
* The "--format=%(trailers)" mechanism gets enhanced to make it
easier to design output for machine consumption.
* When a user does not tell "git pull" to use rebase or merge, the
command gives a loud message telling a user to choose between
rebase or merge but creates a merge anyway, forcing users who would
want to rebase to redo the operation. Fix an early part of this
problem by tightening the condition to give the message---there is
no reason to stop or force the user to choose between rebase or
merge if the history fast-forwards.
* The configuration variable 'core.abbrev' can be set to 'no' to
force no abbreviation regardless of the hash algorithm.
* "git rev-parse" can be explicitly told to give output as absolute
or relative path with the `--path-format=(absolute|relative)` option.
* Bash completion (in contrib/) update to make it easier for
end-users to add completion for their custom "git" subcommands.
* "git maintenance" learned to drive scheduled maintenance on
platforms whose native scheduling methods are not 'cron'.
* After expiring a reflog and making a single commit, the reflog for
the branch would record a single entry that knows both @{0} and
@{1}, but we failed to answer "what commit were we on?", i.e. @{1}
* "git bundle" learns "--stdin" option to read its refs from the
standard input. Also, it now does not lose refs whey they point
at the same object.
* "git log" learned a new "--diff-merges=<how>" option.
* "git ls-files" can and does show multiple entries when the index is
unmerged, which is a source for confusion unless -s/-u option is in
use. A new option --deduplicate has been introduced.
* `git worktree list` now annotates worktrees as prunable, shows
locked and prunable attributes in --porcelain mode, and gained
a --verbose option.
* "git clone" tries to locally check out the branch pointed at by
HEAD of the remote repository after it is done, but the protocol
did not convey the information necessary to do so when copying an
empty repository. The protocol v2 learned how to do so.
* There are other ways than ".." for a single token to denote a
"commit range", namely "<rev>^!" and "<rev>^-<n>", but "git
range-diff" did not understand them.
* The "git range-diff" command learned "--(left|right)-only" option
to show only one side of the compared range.
* "git mergetool" feeds three versions (base, local and remote) of
a conflicted path unmodified. The command learned to optionally
prepare these files with unconflicted parts already resolved.
* The .mailmap is documented to be read only from the root level of a
working tree, but a stray file in a bare repository also was read
by accident, which has been corrected.
* "git maintenance" tool learned a new "pack-refs" maintenance task.
* The error message given when a configuration variable that is
expected to have a boolean value has been improved.
* Signed commits and tags now allow verification of objects, whose
two object names (one in SHA-1, the other in SHA-256) are both
signed.
* "git rev-list" command learned "--disk-usage" option.
* "git {diff,log} --{skip,rotate}-to=<path>" allows the user to
discard diff output for early paths or move them to the end of the
output.
* "git difftool" learned "--skip-to=<path>" option to restart an
interrupted session from an arbitrary path.
* "git grep" has been tweaked to be limited to the sparse checkout
paths.
* "git rebase --[no-]fork-point" gained a configuration variable
rebase.forkPoint so that users do not have to keep specifying a
non-default setting.
Performance, Internal Implementation, Development Support etc.
* A 3-year old test that was not testing anything useful has been
corrected.
* Retire more names with "sha1" in it.
* The topological walk codepath is covered by new trace2 stats.
* Update the Code-of-conduct to version 2.0 from the upstream (we've
been using version 1.4).
* "git mktag" validates its input using its own rules before writing
a tag object---it has been updated to share the logic with "git
fsck".
* Two new ways to feed configuration variable-value pairs via
environment variables have been introduced, and the way
GIT_CONFIG_PARAMETERS encodes variable/value pairs has been tweaked
to make it more robust.
* Tests have been updated so that they do not to get affected by the
name of the default branch "git init" creates.
* "git fetch" learns to treat ref updates atomically in all-or-none
fashion, just like "git push" does, with the new "--atomic" option.
* The peel_ref() API has been replaced with peel_iterated_oid().
* The .use_shell flag in struct child_process that is passed to
run_command() API has been clarified with a bit more documentation.
* Document, clean-up and optimize the code around the cache-tree
extension in the index.
* The ls-refs protocol operation has been optimized to narrow the
sub-hierarchy of refs/ it walks to produce response.
* When removing many branches and tags, the code used to do so one
ref at a time. There is another API it can use to delete multiple
refs, and it makes quite a lot of performance difference when the
refs are packed.
* The "pack-objects" command needs to iterate over all the tags when
automatic tag following is enabled, but it actually iterated over
all refs and then discarded everything outside "refs/tags/"
hierarchy, which was quite wasteful.
* A perf script was made more portable.
* Our setting of GitHub CI test jobs were a bit too eager to give up
once there is even one failure found. Tweak the knob to allow
other jobs keep running even when we see a failure, so that we can
find more failures in a single run.
* We've carried compatibility codepaths for compilers without
variadic macros for quite some time, but the world may be ready for
them to be removed. Force compilation failure on exotic platforms
where variadic macros are not available to find out who screams in
such a way that we can easily revert if it turns out that the world
is not yet ready.
* Code clean-up to ensure our use of hashtables using object names as
keys use the "struct object_id" objects, not the raw hash values.
* Lose the debugging aid that may have been useful in the past, but
no longer is, in the "grep" codepaths.
* Some pretty-format specifiers do not need the data in commit object
(e.g. "%H"), but we were over-eager to load and parse it, which has
been made even lazier.
* Get rid of "GETTEXT_POISON" support altogether, which may or may
not be controversial.
* Introduce an on-disk file to record revindex for packdata, which
traditionally was always created on the fly and only in-core.
* The commit-graph learned to use corrected commit dates instead of
the generation number to help topological revision traversal.
* Piecemeal of rewrite of "git bisect" in C continues.
* When a pager spawned by us exited, the trace log did not record its
exit status correctly, which has been corrected.
* Removal of GIT_TEST_GETTEXT_POISON continues.
* The code to implement "git merge-base --independent" was poorly
done and was kept from the very beginning of the feature.
* Preliminary changes to fsmonitor integration.
* Performance improvements for rename detection.
* The common code to deal with "chunked file format" that is shared
by the multi-pack-index and commit-graph files have been factored
out, to help codepaths for both filetypes to become more robust.
* The approach to "fsck" the incoming objects in "index-pack" is
attractive for performance reasons (we have them already in core,
inflated and ready to be inspected), but fundamentally cannot be
applied fully when we receive more than one pack stream, as a tree
object in one pack may refer to a blob object in another pack as
".gitmodules", when we want to inspect blobs that are used as
".gitmodules" file, for example. Teach "index-pack" to emit
objects that must be inspected later and check them in the calling
"fetch-pack" process.
* The logic to handle "trailer" related placeholders in the
"--format=" mechanisms in the "log" family and "for-each-ref"
family is getting unified.
* Raise the buffer size used when writing the index file out from
(obviously too small) 8kB to (clearly sufficiently large) 128kB.
* It is reported that open() on some platforms (e.g. macOS Big Sur)
can return EINTR even though our timers are set up with SA_RESTART.
A workaround has been implemented and enabled for macOS to rerun
open() transparently from the caller when this happens.
Fixes since v2.30
-----------------
* Diagnose command line error of "git rebase" early.
* Clean up option descriptions in "git cmd --help".
* "git stash" did not work well in a sparsely checked out working
tree.
* Some tests expect that "ls -l" output has either '-' or 'x' for
group executable bit, but setgid bit can be inherited from parent
directory and make these fields 'S' or 's' instead, causing test
failures.
* "git for-each-repo --config=<var> <cmd>" should not run <cmd> for
any repository when the configuration variable <var> is not defined
even once.
* Fix 2.29 regression where "git mergetool --tool-help" fails to list
all the available tools.
* Fix for procedure to building CI test environment for mac.
* The implementation of "git branch --sort" wrt the detached HEAD
display has always been hacky, which has been cleaned up.
* Newline characters in the host and path part of git:// URL are
now forbidden.
* "git diff" showed a submodule working tree with untracked cruft as
"Submodule commit <objectname>-dirty", but a natural expectation is
that the "-dirty" indicator would align with "git describe --dirty",
which does not consider having untracked files in the working tree
as source of dirtiness. The inconsistency has been fixed.
* When more than one commit with the same patch ID appears on one
side, "git log --cherry-pick A...B" did not exclude them all when a
commit with the same patch ID appears on the other side. Now it
does.
* Documentation for "git fsck" lost stale bits that has become
incorrect.
* Doc fix for packfile URI feature.
* When "git rebase -i" processes "fixup" insn, there is no reason to
clean up the commit log message, but we did the usual stripspace
processing. This has been corrected.
(merge f7d42ceec5 js/rebase-i-commit-cleanup-fix later to maint).
* Fix in passing custom args from "git clone" to "upload-pack" on the
other side.
(merge ad6b5fefbd jv/upload-pack-filter-spec-quotefix later to maint).
* The command line completion (in contrib/) completed "git branch -d"
with branch names, but "git branch -D" offered tagnames in addition,
which has been corrected. "git branch -M" had the same problem.
(merge 27dc071b9a jk/complete-branch-force-delete later to maint).
* When commands are started from a subdirectory, they may have to
compare the path to the subdirectory (called prefix and found out
from $(pwd)) with the tracked paths. On macOS, $(pwd) and
readdir() yield decomposed path, while the tracked paths are
usually normalized to the precomposed form, causing mismatch. This
has been fixed by taking the same approach used to normalize the
command line arguments.
(merge 5c327502db tb/precompose-prefix-too later to maint).
* Even though invocations of "die()" were logged to the trace2
system, "BUG()"s were not, which has been corrected.
(merge 0a9dde4a04 jt/trace2-BUG later to maint).
* "git grep --untracked" is meant to be "let's ALSO find in these
files on the filesystem" when looking for matches in the working
tree files, and does not make any sense if the primary search is
done against the index, or the tree objects. The "--cached" and
"--untracked" options have been marked as mutually incompatible.
(merge 0c5d83b248 mt/grep-cached-untracked later to maint).
* Fix "git fsck --name-objects" which apparently has not been used by
anybody who is motivated enough to report breakage.
(merge e89f89361c js/fsck-name-objects-fix later to maint).
* Avoid individual tests in t5411 from getting affected by each other
by forcing them to use separate output files during the test.
(merge 822ee894f6 jx/t5411-unique-filenames later to maint).
* Test to make sure "git rev-parse one-thing one-thing" gives
the same thing twice (when one-thing is --since=X).
(merge a5cdca4520 ew/rev-parse-since-test later to maint).
* When certain features (e.g. grafts) used in the repository are
incompatible with the use of the commit-graph, we used to silently
turned commit-graph off; we now tell the user what we are doing.
(merge c85eec7fc3 js/commit-graph-warning later to maint).
* Objects that lost references can be pruned away, even when they
have notes attached to it (and these notes will become dangling,
which in turn can be pruned with "git notes prune"). This has been
clarified in the documentation.
(merge fa9ab027ba mz/doc-notes-are-not-anchors later to maint).
* The error codepath around the "--temp/--prefix" feature of "git
checkout-index" has been improved.
(merge 3f7ba60350 mt/checkout-index-corner-cases later to maint).
* The "git maintenance register" command had trouble registering bare
repositories, which had been corrected.
* A handful of multi-word configuration variable names in
documentation that are spelled in all lowercase have been corrected
to use the more canonical camelCase.
(merge 7dd0eaa39c dl/doc-config-camelcase later to maint).
* "git push $there --delete ''" should have been diagnosed as an
error, but instead turned into a matching push, which has been
corrected.
(merge 20e416409f jc/push-delete-nothing later to maint).
* Test script modernization.
(merge 488acf15df sv/t7001-modernize later to maint).
* An under-allocation for the untracked cache data has been corrected.
(merge 6347d649bc jh/untracked-cache-fix later to maint).
* Other code cleanup, docfix, build fix, etc.
(merge e3f5da7e60 sg/t7800-difftool-robustify later to maint).
(merge 9d336655ba js/doc-proto-v2-response-end later to maint).
(merge 1b5b8cf072 jc/maint-column-doc-typofix later to maint).
(merge 3a837b58e3 cw/pack-config-doc later to maint).
(merge 01168a9d89 ug/doc-commit-approxidate later to maint).
(merge b865734760 js/params-vs-args later to maint).

View File

@ -0,0 +1,27 @@
Git 2.31.1 Release Notes
========================
Fixes since v2.31
-----------------
* The fsmonitor interface read from its input without making sure
there is something to read from. This bug is new in 2.31
timeframe.
* The data structure used by fsmonitor interface was not properly
duplicated during an in-core merge, leading to use-after-free etc.
* "git bisect" reimplemented more in C during 2.30 timeframe did not
take an annotated tag as a good/bad endpoint well. This regression
has been corrected.
* Fix macros that can silently inject unintended null-statements.
* CALLOC_ARRAY() macro replaces many uses of xcalloc().
* Update insn in Makefile comments to run fuzz-all target.
* Fix a corner case bug in "git mv" on case insensitive systems,
which was introduced in 2.29 timeframe.
Also contains various documentation updates and code clean-ups.

View File

@ -0,0 +1,416 @@
Git 2.32 Release Notes
======================
Backward compatibility notes
----------------------------
* ".gitattributes", ".gitignore", and ".mailmap" files that are
symbolic links are ignored.
* "git apply --3way" used to first attempt a straight application,
and only fell back to the 3-way merge algorithm when the stright
application failed. Starting with this version, the command will
first try the 3-way merge algorithm and only when it fails (either
resulting with conflict or the base versions of blobs are missing),
falls back to the usual patch application.
Updates since v2.31
-------------------
UI, Workflows & Features
* It does not make sense to make ".gitattributes", ".gitignore" and
".mailmap" symlinks, as they are supposed to be usable from the
object store (think: bare repositories where HEAD:.mailmap etc. are
used). When these files are symbolic links, we used to read the
contents of the files pointed by them by mistake, which has been
corrected.
* "git stash show" learned to optionally show untracked part of the
stash.
* "git log --format='...'" learned "%(describe)" placeholder.
* "git repack" so far has been only capable of repacking everything
under the sun into a single pack (or split by size). A cleverer
strategy to reduce the cost of repacking a repository has been
introduced.
* The http codepath learned to let the credential layer to cache the
password used to unlock a certificate that has successfully been
used.
* "git commit --fixup=<commit>", which was to tweak the changes made
to the contents while keeping the original log message intact,
learned "--fixup=(amend|reword):<commit>", that can be used to
tweak both the message and the contents, and only the message,
respectively.
* "git send-email" learned to honor the core.hooksPath configuration.
* "git format-patch -v<n>" learned to allow a reroll count that is
not an integer.
* "git commit" learned "--trailer <key>[=<value>]" option; together
with the interpret-trailers command, this will make it easier to
support custom trailers.
* "git clone --reject-shallow" option fails the clone as soon as we
notice that we are cloning from a shallow repository.
* A configuration variable has been added to force tips of certain
refs to be given a reachability bitmap.
* "gitweb" learned "e-mail privacy" feature to redact strings that
look like e-mail addresses on various pages.
* "git apply --3way" has always been "to fall back to 3-way merge
only when straight application fails". Swap the order of falling
back so that 3-way is always attempted first (only when the option
is given, of course) and then straight patch application is used as
a fallback when it fails.
* "git apply" now takes "--3way" and "--cached" at the same time, and
work and record results only in the index.
* The command line completion (in contrib/) has learned that
CHERRY_PICK_HEAD is a possible pseudo-ref.
* Userdiff patterns for "Scheme" has been added.
* "git log" learned "--diff-merges=<style>" option, with an
associated configuration variable log.diffMerges.
* "git log --format=..." placeholders learned %ah/%ch placeholders to
request the --date=human output.
* Replace GIT_CONFIG_NOSYSTEM mechanism to decline from reading the
system-wide configuration file with GIT_CONFIG_SYSTEM that lets
users specify from which file to read the system-wide configuration
(setting it to an empty file would essentially be the same as
setting NOSYSTEM), and introduce GIT_CONFIG_GLOBAL to override the
per-user configuration in $HOME/.gitconfig.
* "git add" and "git rm" learned not to touch those paths that are
outside of sparse checkout.
* "git rev-list" learns the "--filter=object:type=<type>" option,
which can be used to exclude objects of the given kind from the
packfile generated by pack-objects.
* The command line completion (in contrib/) for "git stash" has been
updated.
* "git subtree" updates.
* It is now documented that "format-patch" skips merges.
* Options to "git pack-objects" that take numeric values like
--window and --depth should not accept negative values; the input
validation has been tightened.
* The way the command line specified by the trailer.<token>.command
configuration variable receives the end-user supplied value was
both error prone and misleading. An alternative to achieve the
same goal in a safer and more intuitive way has been added, as
the trailer.<token>.cmd configuration variable, to replace it.
* "git add -i --dry-run" does not dry-run, which was surprising. The
combination of options has taught to error out.
* "git push" learns to discover common ancestor with the receiving
end over protocol v2. This will hopefully make "git push" as
efficient as "git fetch" in avoiding objects from getting
transferred unnecessarily.
* "git mailinfo" (hence "git am") learned the "--quoted-cr" option to
control how lines ending with CRLF wrapped in base64 or qp are
handled.
Performance, Internal Implementation, Development Support etc.
* Rename detection rework continues.
* GIT_TEST_FAIL_PREREQS is a mechanism to skip test pieces with
prerequisites to catch broken tests that depend on the side effects
of optional pieces, but did not work at all when negative
prerequisites were involved.
(merge 27d578d904 jk/fail-prereq-testfix later to maint).
* "git diff-index" codepath has been taught to trust fsmonitor status
to reduce number of lstat() calls.
(merge 7e5aa13d2c nk/diff-index-fsmonitor later to maint).
* Reorganize Makefile to allow building git.o and other essential
objects without extra stuff needed only for testing.
* Preparatory API changes for parallel checkout.
* A simple IPC interface gets introduced to build services like
fsmonitor on top.
* Fsck API clean-up.
* SECURITY.md that is facing individual contributors and end users
has been introduced. Also a procedure to follow when preparing
embargoed releases has been spelled out.
(merge 09420b7648 js/security-md later to maint).
* Optimize "rev-list --use-bitmap-index --objects" corner case that
uses negative tags as the stopping points.
* CMake update for vsbuild.
* An on-disk reverse-index to map the in-pack location of an object
back to its object name across multiple packfiles is introduced.
* Generate [ec]tags under $(QUIET_GEN).
* Clean-up codepaths that implements "git send-email --validate"
option and improves the message from it.
* The last remnant of gettext-poison has been removed.
* The test framework has been taught to optionally turn the default
merge strategy to "ort" throughout the system where we use
three-way merges internally, like cherry-pick, rebase etc.,
primarily to enhance its test coverage (the strategy has been
available as an explicit "-s ort" choice).
* A bit of code clean-up and a lot of test clean-up around userdiff
area.
* Handling of "promisor packs" that allows certain objects to be
missing and lazily retrievable has been optimized (a bit).
* When packet_write() fails, we gave an extra error message
unnecessarily, which has been corrected.
* The checkout machinery has been taught to perform the actual
write-out of the files in parallel when able.
* Show errno in the trace output in the error codepath that calls
read_raw_ref method.
* Effort to make the command line completion (in contrib/) safe with
"set -u" continues.
* Tweak a few tests for "log --format=..." that show timestamps in
various formats.
* The reflog expiry machinery has been taught to emit trace events.
* Over-the-wire protocol learns a new request type to ask for object
sizes given a list of object names.
Fixes since v2.31
-----------------
* The fsmonitor interface read from its input without making sure
there is something to read from. This bug is new in 2.31
timeframe.
* The data structure used by fsmonitor interface was not properly
duplicated during an in-core merge, leading to use-after-free etc.
* "git bisect" reimplemented more in C during 2.30 timeframe did not
take an annotated tag as a good/bad endpoint well. This regression
has been corrected.
* Fix macros that can silently inject unintended null-statements.
* CALLOC_ARRAY() macro replaces many uses of xcalloc().
* Update insn in Makefile comments to run fuzz-all target.
* Fix a corner case bug in "git mv" on case insensitive systems,
which was introduced in 2.29 timeframe.
* We had a code to diagnose and die cleanly when a required
clean/smudge filter is missing, but an assert before that
unnecessarily fired, hiding the end-user facing die() message.
(merge 6fab35f748 mt/cleanly-die-upon-missing-required-filter later to maint).
* Update C code that sets a few configuration variables when a remote
is configured so that it spells configuration variable names in the
canonical camelCase.
(merge 0f1da600e6 ab/remote-write-config-in-camel-case later to maint).
* A new configuration variable has been introduced to allow choosing
which version of the generation number gets used in the
commit-graph file.
(merge 702110aac6 ds/commit-graph-generation-config later to maint).
* Perf test update to work better in secondary worktrees.
(merge 36e834abc1 jk/perf-in-worktrees later to maint).
* Updates to memory allocation code around the use of pcre2 library.
(merge c1760352e0 ab/grep-pcre2-allocfix later to maint).
* "git -c core.bare=false clone --bare ..." would have segfaulted,
which has been corrected.
(merge 75555676ad bc/clone-bare-with-conflicting-config later to maint).
* When "git checkout" removes a path that does not exist in the
commit it is checking out, it wasn't careful enough not to follow
symbolic links, which has been corrected.
(merge fab78a0c3d mt/checkout-remove-nofollow later to maint).
* A few option description strings started with capital letters,
which were corrected.
(merge 5ee90326dc cc/downcase-opt-help later to maint).
* Plug or annotate remaining leaks that trigger while running the
very basic set of tests.
(merge 68ffe095a2 ah/plugleaks later to maint).
* The hashwrite() API uses a buffering mechanism to avoid calling
write(2) too frequently. This logic has been refactored to be
easier to understand.
(merge ddaf1f62e3 ds/clarify-hashwrite later to maint).
* "git cherry-pick/revert" with or without "--[no-]edit" did not spawn
the editor as expected (e.g. "revert --no-edit" after a conflict
still asked to edit the message), which has been corrected.
(merge 39edfd5cbc en/sequencer-edit-upon-conflict-fix later to maint).
* "git daemon" has been tightened against systems that take backslash
as directory separator.
(merge 9a7f1ce8b7 rs/daemon-sanitize-dir-sep later to maint).
* A NULL-dereference bug has been corrected in an error codepath in
"git for-each-ref", "git branch --list" etc.
(merge c685450880 jk/ref-filter-segfault-fix later to maint).
* Streamline the codepath to fix the UTF-8 encoding issues in the
argv[] and the prefix on macOS.
(merge c7d0e61016 tb/precompose-prefix-simplify later to maint).
* The command-line completion script (in contrib/) had a couple of
references that would have given a warning under the "-u" (nounset)
option.
(merge c5c0548d79 vs/completion-with-set-u later to maint).
* When "git pack-objects" makes a literal copy of a part of existing
packfile using the reachability bitmaps, its update to the progress
meter was broken.
(merge 8e118e8490 jk/pack-objects-bitmap-progress-fix later to maint).
* The dependencies for config-list.h and command-list.h were broken
when the former was split out of the latter, which has been
corrected.
(merge 56550ea718 sg/bugreport-fixes later to maint).
* "git push --quiet --set-upstream" was not quiet when setting the
upstream branch configuration, which has been corrected.
(merge f3cce896a8 ow/push-quiet-set-upstream later to maint).
* The prefetch task in "git maintenance" assumed that "git fetch"
from any remote would fetch all its local branches, which would
fetch too much if the user is interested in only a subset of
branches there.
(merge 32f67888d8 ds/maintenance-prefetch-fix later to maint).
* Clarify that pathnames recorded in Git trees are most often (but
not necessarily) encoded in UTF-8.
(merge 9364bf465d ab/pathname-encoding-doc later to maint).
* "git --config-env var=val cmd" weren't accepted (only
--config-env=var=val was).
(merge c331551ccf ps/config-env-option-with-separate-value later to maint).
* When the reachability bitmap is in effect, the "do not lose
recently created objects and those that are reachable from them"
safety to protect us from races were disabled by mistake, which has
been corrected.
(merge 2ba582ba4c jk/prune-with-bitmap-fix later to maint).
* Cygwin pathname handling fix.
(merge bccc37fdc7 ad/cygwin-no-backslashes-in-paths later to maint).
* "git rebase --[no-]reschedule-failed-exec" did not work well with
its configuration variable, which has been corrected.
(merge e5b32bffd1 ab/rebase-no-reschedule-failed-exec later to maint).
* Portability fix for command line completion script (in contrib/).
(merge f2acf763e2 si/zsh-complete-comment-fix later to maint).
* "git repack -A -d" in a partial clone unnecessarily loosened
objects in promisor pack.
* "git bisect skip" when custom words are used for new/old did not
work, which has been corrected.
* A few variants of informational message "Already up-to-date" has
been rephrased.
(merge ad9322da03 js/merge-already-up-to-date-message-reword later to maint).
* "git submodule update --quiet" did not propagate the quiet option
down to underlying "git fetch", which has been corrected.
(merge 62af4bdd42 nc/submodule-update-quiet later to maint).
* Document that our test can use "local" keyword.
(merge a84fd3bcc6 jc/test-allows-local later to maint).
* The word-diff mode has been taught to work better with a word
regexp that can match an empty string.
(merge 0324e8fc6b pw/word-diff-zero-width-matches later to maint).
* "git p4" learned to find branch points more efficiently.
(merge 6b79818bfb jk/p4-locate-branch-point-optim later to maint).
* When "git update-ref -d" removes a ref that is packed, it left
empty directories under $GIT_DIR/refs/ for
(merge 5f03e5126d wc/packed-ref-removal-cleanup later to maint).
* "git clean" and "git ls-files -i" had confusion around working on
or showing ignored paths inside an ignored directory, which has
been corrected.
(merge b548f0f156 en/dir-traversal later to maint).
* The handling of "%(push)" formatting element of "for-each-ref" and
friends was broken when the same codepath started handling
"%(push:<what>)", which has been corrected.
(merge 1e1c4c5eac zh/ref-filter-push-remote-fix later to maint).
* The bash prompt script (in contrib/) did not work under "set -u".
(merge 5c0cbdb107 en/prompt-under-set-u later to maint).
* The "chainlint" feature in the test framework is a handy way to
catch common mistakes in writing new tests, but tends to get
expensive. An knob to selectively disable it has been introduced
to help running tests that the developer has not modified.
(merge 2d86a96220 jk/test-chainlint-softer later to maint).
* The "rev-parse" command did not diagnose the lack of argument to
"--path-format" option, which was introduced in v2.31 era, which
has been corrected.
(merge 99fc555188 wm/rev-parse-path-format-wo-arg later to maint).
* Other code cleanup, docfix, build fix, etc.
(merge f451960708 dl/cat-file-doc-cleanup later to maint).
(merge 12604a8d0c sv/t9801-test-path-is-file-cleanup later to maint).
(merge ea7e63921c jr/doc-ignore-typofix later to maint).
(merge 23c781f173 ps/update-ref-trans-hook-doc later to maint).
(merge 42efa1231a jk/filter-branch-sha256 later to maint).
(merge 4c8e3dca6e tb/push-simple-uses-branch-merge-config later to maint).
(merge 6534d436a2 bs/asciidoctor-installation-hints later to maint).
(merge 47957485b3 ab/read-tree later to maint).
(merge 2be927f3d1 ab/diff-no-index-tests later to maint).
(merge 76593c09bb ab/detox-gettext-tests later to maint).
(merge 28e29ee38b jc/doc-format-patch-clarify later to maint).
(merge fc12b6fdde fm/user-manual-use-preface later to maint).
(merge dba94e3a85 cc/test-helper-bloom-usage-fix later to maint).
(merge 61a7660516 hn/reftable-tables-doc-update later to maint).
(merge 81ed96a9b2 jt/fetch-pack-request-fix later to maint).
(merge 151b6c2dd7 jc/doc-do-not-capitalize-clarification later to maint).
(merge 9160068ac6 js/access-nul-emulation-on-windows later to maint).
(merge 7a14acdbe6 po/diff-patch-doc later to maint).
(merge f91371b948 pw/patience-diff-clean-up later to maint).
(merge 3a7f0908b6 mt/clean-clean later to maint).
(merge d4e2d15a8b ab/streaming-simplify later to maint).
(merge 0e59f7ad67 ah/merge-ort-i18n later to maint).
(merge e6f68f62e0 ls/typofix later to maint).

View File

@ -0,0 +1,279 @@
Git 2.33 Release Notes
======================
Updates since Git 2.32
----------------------
UI, Workflows & Features
* "git send-email" learned the "--sendmail-cmd" command line option
and the "sendemail.sendmailCmd" configuration variable, which is a
more sensible approach than the current way of repurposing the
"smtp-server" that is meant to name the server to instead name the
command to talk to the server.
* The userdiff pattern for C# learned the token "record".
* "git rev-list" learns to omit the "commit <object-name>" header
lines from the output with the `--no-commit-header` option.
* "git worktree add --lock" learned to record why the worktree is
locked with a custom message.
Performance, Internal Implementation, Development Support etc.
* The code to handle the "--format" option in "for-each-ref" and
friends made too many string comparisons on %(atom)s used in the
format string, which has been corrected by converting them into
enum when the format string is parsed.
* Use the hashfile API in the codepath that writes the index file to
reduce code duplication.
* Repeated rename detections in a sequence of mergy operations have
been optimized out for the 'ort' merge strategy.
* Preliminary clean-up of tests before the main reftable changes
hits the codebase.
* The backend for "diff -G/-S" has been updated to use pcre2 engine
when available.
* Use ".DELETE_ON_ERROR" pseudo target to simplify our Makefile.
* Code cleanup around struct_type_init() functions.
* "git send-email" optimization.
* GitHub Actions / CI update.
(merge 0dc787a9f2 js/ci-windows-update later to maint).
* Object accesses in repositories with many alternate object store
have been optimized.
* "git log" has been optimized not to waste cycles to load ref
decoration data that may not be needed.
* Many "printf"-like helper functions we have have been annotated
with __attribute__() to catch placeholder/parameter mismatches.
* Tests that cover protocol bits have been updated and helpers
used there have been consolidated.
* The CI gained a new job to run "make sparse" check.
* "git status" codepath learned to work with sparsely populated index
without hydrating it fully.
* A guideline for gender neutral documentation has been added.
* Documentation on "git diff -l<n>" and diff.renameLimit have been
updated, and the defaults for these limits have been raised.
* The completion support used to offer alternate spelling of options
that exist only for compatibility, which has been corrected.
* "TEST_OUTPUT_DIRECTORY=there make test" failed to work, which has
been corrected.
* "git bundle" gained more test coverage.
* "git read-tree" had a codepath where blobs are fetched one-by-one
from the promisor remote, which has been corrected to fetch in bulk.
* Rewrite of "git submodule" in C continues.
* "git checkout" and "git commit" learn to work without unnecessarily
expanding sparse indexes.
Fixes since v2.32
-----------------
* We historically rejected a very short string as an author name
while accepting a patch e-mail, which has been loosened.
(merge 72ee47ceeb ef/mailinfo-short-name later to maint).
* The parallel checkout codepath did not initialize object ID field
used to talk to the worker processes in a futureproof way.
* Rewrite code that triggers undefined behaviour warning.
(merge aafa5df0df jn/size-t-casted-to-off-t-fix later to maint).
* The description of "fast-forward" in the glossary has been updated.
(merge e22f2daed0 ry/clarify-fast-forward-in-glossary later to maint).
* Recent "git clone" left a temporary directory behind when the
transport layer returned an failure.
(merge 6aacb7d861 jk/clone-clean-upon-transport-error later to maint).
* "git fetch" over protocol v2 left its side of the socket open after
it finished speaking, which unnecessarily wasted the resource on
the other side.
(merge ae1a7eefff jk/fetch-pack-v2-half-close-early later to maint).
* The command line completion (in contrib/) learned that "git diff"
takes the "--anchored" option.
(merge d1e7c2cac9 tb/complete-diff-anchored later to maint).
* "git-svn" tests assumed that "locale -a", which is used to pick an
available UTF-8 locale, is available everywhere. A knob has been
introduced to allow testers to specify a suitable locale to use.
(merge 482c962de4 dd/svn-test-wo-locale-a later to maint).
* Update "git subtree" to work better on Windows.
(merge 77f37de39f js/subtree-on-windows-fix later to maint).
* Remove multimail from contrib/
(merge f74d11471f js/no-more-multimail later to maint).
* Make the codebase MSAN clean.
(merge 4dbc55e87d ah/uninitialized-reads-fix later to maint).
* Work around inefficient glob substitution in older versions of bash
by rewriting parts of a test.
(merge eb87c6f559 jx/t6020-with-older-bash later to maint).
* Avoid duplicated work while building reachability bitmaps.
(merge aa9ad6fee5 jk/bitmap-tree-optim later to maint).
* We broke "GIT_SKIP_TESTS=t?000" to skip certain tests in recent
update, which got fixed.
* The side-band demultiplexer that is used to display progress output
from the remote end did not clear the line properly when the end of
line hits at a packet boundary, which has been corrected.
* Some test scripts assumed that readlink(1) was universally
installed and available, which is not the case.
(merge 7c0afdf23c jk/test-without-readlink-1 later to maint).
* Recent update to completion script (in contrib/) broke those who
use the __git_complete helper to define completion to their custom
command.
(merge cea232194d fw/complete-cmd-idx-fix later to maint).
* Output from some of our tests were affected by the width of the
terminal that they were run in, which has been corrected by
exporting a fixed value in the COLUMNS environment.
(merge c49a177bec ab/fix-columns-to-80-during-tests later to maint).
* On Windows, mergetool has been taught to find kdiff3.exe just like
it finds winmerge.exe.
(merge 47eb4c6890 ms/mergetools-kdiff3-on-windows later to maint).
* When we cannot figure out how wide the terminal is, we use a
fallback value of 80 ourselves (which cannot be avoided), but when
we run the pager, we export it in COLUMNS, which forces the pager
to use the hardcoded value, even when the pager is perfectly
capable to figure it out itself. Stop exporting COLUMNS when we
fall back on the hardcoded default value for our own use.
(merge 9b6e2c8b98 js/stop-exporting-bogus-columns later to maint).
* "git cat-file --batch-all-objects"" misbehaved when "--batch" is in
use and did not ask for certain object traits.
(merge ee02ac6164 zh/cat-file-batch-fix later to maint).
* Some code and doc clarification around "git push".
* The "union" conflict resultion variant misbehaved when used with
binary merge driver.
(merge 382b601acd jk/union-merge-binary later to maint).
* Prevent "git p4" from failing to submit changes to binary file.
(merge 54662d5958 dc/p4-binary-submit-fix later to maint).
* "git grep --and -e foo" ought to have been diagnosed as an error
but instead segfaulted, which has been corrected.
(merge fe7fe62d8d rs/grep-parser-fix later to maint).
* The merge code had funny interactions between content based rename
detection and directory rename detection.
(merge 3585d0ea23 en/merge-dir-rename-corner-case-fix later to maint).
* When rebuilding the multi-pack index file reusing an existing one,
we used to blindly trust the existing file and ended up carrying
corrupted data into the updated file, which has been corrected.
(merge f89ecf7988 tb/midx-use-checksum later to maint).
* Update the location of system-side configuration file on Windows.
(merge e355307692 js/gfw-system-config-loc-fix later to maint).
* Code recently added to support common ancestry negotiation during
"git push" did not sanity check its arguments carefully enough.
(merge eff40457a4 ab/fetch-negotiate-segv-fix later to maint).
* Update the documentation not to assume users are of certain gender
and adds to guidelines to do so.
(merge 46a237f42f ds/gender-neutral-doc later to maint).
* "git commit --allow-empty-message" won't abort the operation upon
an empty message, but the hint shown in the editor said otherwise.
(merge 6f70f00b4f hj/commit-allow-empty-message later to maint).
* The code that gives an error message in "git multi-pack-index" when
no subcommand is given tried to print a NULL pointer as a strong,
which has been corrected.
(merge 88617d11f9 tb/reverse-midx later to maint).
* CI update.
(merge a066a90db6 js/ci-check-whitespace-updates later to maint).
* Documentation fix for "git pull --rebase=no".
(merge d3236becec fc/pull-no-rebase-merges-theirs-into-ours later to maint).
* A race between repacking and using pack bitmaps has been corrected.
(merge dc1daacdcc jk/check-pack-valid-before-opening-bitmap later to maint).
* The local changes stashed by "git merge --autostash" were lost when
the merge failed in certain ways, which has been corrected.
* Windows rmdir() equivalent behaves differently from POSIX ones in
that when used on a symbolic link that points at a directory, the
target directory gets removed, which has been corrected.
(merge 3e7d4888e5 tb/mingw-rmdir-symlink-to-directory later to maint).
* Other code cleanup, docfix, build fix, etc.
(merge bfe35a6165 ah/doc-describe later to maint).
(merge f302c1e4aa jc/clarify-revision-range later to maint).
(merge 3127ff90ea tl/fix-packfile-uri-doc later to maint).
(merge a84216c684 jk/doc-color-pager later to maint).
(merge 4e0a64a713 ab/trace2-squelch-gcc-warning later to maint).
(merge 225f7fa847 ps/rev-list-object-type-filter later to maint).
(merge 5317dfeaed dd/honor-users-tar-in-tests later to maint).
(merge ace6d8e3d6 tk/partial-clone-repack-doc later to maint).
(merge 7ba68e0cf1 js/trace2-discard-event-docfix later to maint).
(merge 8603c419d3 fc/doc-default-to-upstream-config later to maint).
(merge 1d72b604ef jk/revision-squelch-gcc-warning later to maint).
(merge abcb66c614 ar/typofix later to maint).
(merge 9853830787 ah/graph-typofix later to maint).
(merge aac578492d ab/config-hooks-path-testfix later to maint).
(merge 98c7656a18 ar/more-typofix later to maint).
(merge 6fb9195f6c jk/doc-max-pack-size later to maint).
(merge 4184cbd635 ar/mailinfo-memcmp-to-skip-prefix later to maint).
(merge 91d2347033 ar/doc-libera-chat-in-my-first-contrib later to maint).
(merge 338abb0f04 ab/cmd-foo-should-return later to maint).
(merge 546096a5cb ab/xdiff-bug-cleanup later to maint).
(merge b7b793d1e7 ab/progress-cleanup later to maint).
(merge d94f9b8e90 ba/object-info later to maint).
(merge 52ff891c03 ar/test-code-cleanup later to maint).
(merge a0538e5c8b dd/document-log-decorate-default later to maint).
(merge ce24797d38 mr/cmake later to maint).
(merge 9eb542f2ee ab/pre-auto-gc-hook-test later to maint).
(merge 9fffc38583 bk/doc-commit-typofix later to maint).
(merge 1cf823d8f0 ks/submodule-cleanup later to maint).
(merge ebbf5d2b70 js/config-mak-windows-pcre-fix later to maint).
(merge 617480d75b hn/refs-iterator-peel-returns-boolean later to maint).
(merge 6a24cc71ed ar/submodule-helper-include-cleanup later to maint).
(merge 5632e838f8 rs/khash-alloc-cleanup later to maint).
(merge b1d87fbaf1 jk/typofix later to maint).
(merge e04170697a ab/gitignore-discovery-doc later to maint).
(merge 8232a0ff48 dl/packet-read-response-end-fix later to maint).
(merge eb448631fb dl/diff-merge-base later to maint).
(merge c510928a25 hn/refs-debug-empty-prefix later to maint).
(merge ddcb189d9d tb/bitmap-type-filter-comment-fix later to maint).
(merge 878b399734 pb/submodule-recurse-doc later to maint).
(merge 734283855f jk/config-env-doc later to maint).
(merge 482e1488a9 ab/getcwd-test later to maint).
(merge f0b922473e ar/doc-markup-fix later to maint).

View File

@ -0,0 +1,138 @@
Git 2.33.1 Release Notes
========================
This primarily is to backport various fixes accumulated during the
development towards Git 2.34, the next feature release.
Fixes since v2.33
-----------------
* The unicode character width table (used for output alignment) has
been updated.
* Input validation of "git pack-objects --stdin-packs" has been
corrected.
* Bugfix for common ancestor negotiation recently introduced in "git
push" codepath.
* "git pull" had various corner cases that were not well thought out
around its --rebase backend, e.g. "git pull --ff-only" did not stop
but went ahead and rebased when the history on other side is not a
descendant of our history. The series tries to fix them up.
* "git apply" miscounted the bytes and failed to read to the end of
binary hunks.
* "git range-diff" code clean-up.
* "git commit --fixup" now works with "--edit" again, after it was
broken in v2.32.
* Use upload-artifacts v1 (instead of v2) for 32-bit linux, as the
new version has a blocker bug for that architecture.
* Checking out all the paths from HEAD during the last conflicted
step in "git rebase" and continuing would cause the step to be
skipped (which is expected), but leaves MERGE_MSG file behind in
$GIT_DIR and confuses the next "git commit", which has been
corrected.
* Various bugs in "git rebase -r" have been fixed.
* mmap() imitation used to call xmalloc() that dies upon malloc()
failure, which has been corrected to just return an error to the
caller to be handled.
* "git diff --relative" segfaulted and/or produced incorrect result
when there are unmerged paths.
* The delayed checkout code path in "git checkout" etc. were chatty
even when --quiet and/or --no-progress options were given.
* "git branch -D <branch>" used to refuse to remove a broken branch
ref that points at a missing commit, which has been corrected.
* Build update for Apple clang.
* The parser for the "--nl" option of "git column" has been
corrected.
* "git upload-pack" which runs on the other side of "git fetch"
forgot to take the ref namespaces into account when handling
want-ref requests.
* The sparse-index support can corrupt the index structure by storing
a stale and/or uninitialized data, which has been corrected.
* Buggy tests could damage repositories outside the throw-away test
area we created. We now by default export GIT_CEILING_DIRECTORIES
to limit the damage from such a stray test.
* Even when running "git send-email" without its own threaded
discussion support, a threading related header in one message is
carried over to the subsequent message to result in an unwanted
threading, which has been corrected.
* The output from "git fast-export", when its anonymization feature
is in use, showed an annotated tag incorrectly.
* Recent "diff -m" changes broke "gitk", which has been corrected.
* "git maintenance" scheduler fix for macOS.
* A pathname in an advice message has been made cut-and-paste ready.
* The "git apply -3" code path learned not to bother the lower level
merge machinery when the three-way merge can be trivially resolved
without the content level merge.
* The code that optionally creates the *.rev reverse index file has
been optimized to avoid needless computation when it is not writing
the file out.
* "git range-diff -I... <range> <range>" segfaulted, which has been
corrected.
* The order in which various files that make up a single (conceptual)
packfile has been reevaluated and straightened up. This matters in
correctness, as an incomplete set of files must not be shown to a
running Git.
* The "mode" word is useless in a call to open(2) that does not
create a new file. Such a call in the files backend of the ref
subsystem has been cleaned up.
* "git update-ref --stdin" failed to flush its output as needed,
which potentially led the conversation to a deadlock.
* When "git am --abort" fails to abort correctly, it still exited
with exit status of 0, which has been corrected.
* Correct nr and alloc members of strvec struct to be of type size_t.
* "git stash", where the tentative change involves changing a
directory to a file (or vice versa), was confused, which has been
corrected.
* "git clone" from a repository whose HEAD is unborn into a bare
repository didn't follow the branch name the other side used, which
is corrected.
* "git cvsserver" had a long-standing bug in its authentication code,
which has finally been corrected (it is unclear and is a separate
question if anybody is seriously using it, though).
* "git difftool --dir-diff" mishandled symbolic links.
* Sensitive data in the HTTP trace were supposed to be redacted, but
we failed to do so in HTTP/2 requests.
* "make clean" has been updated to remove leftover .depend/
directories, even when it is not told to use them to compute header
dependencies.
* Protocol v0 clients can get stuck parsing a malformed feature line.
Also contains various documentation updates and code clean-ups.

View File

@ -0,0 +1,438 @@
Git 2.34 Release Notes
======================
Updates since Git 2.33
----------------------
Backward compatibility notes
* The "--preserve-merges" option of "git rebase" has been removed.
UI, Workflows & Features
* Pathname expansion (like "~username/") learned a way to specify a
location relative to Git installation (e.g. its $sharedir which is
$(prefix)/share), with "%(prefix)".
* The `ort` strategy is used instead of `recursive` as the default
merge strategy.
* The userdiff pattern for "java" language has been updated.
* "git rebase" by default skips changes that are equivalent to
commits that are already in the history the branch is rebased onto;
give messages when this happens to let the users be aware of
skipped commits, and also teach them how to tell "rebase" to keep
duplicated changes.
* The advice message that "git cherry-pick" gives when it asks
conflicted replay of a commit to be resolved by the end user has
been updated.
* After "git clone --recurse-submodules", all submodules are cloned
but they are not by default recursed into by other commands. With
submodule.stickyRecursiveClone configuration set, submodule.recurse
configuration is set to true in a repository created by "clone"
with "--recurse-submodules" option.
* The logic for auto-correction of misspelt subcommands learned to go
interactive when the help.autocorrect configuration variable is set
to 'prompt'.
* "git maintenance" scheduler learned to use systemd timers as a
possible backend.
* "git diff --submodule=diff" showed failure from run_command() when
trying to run diff inside a submodule, when the user manually
removes the submodule directory.
* "git bundle unbundle" learned to show progress display.
* In cone mode, the sparse-index code path learned to remove ignored
files (like build artifacts) outside the sparse cone, allowing the
entire directory outside the sparse cone to be removed, which is
especially useful when the sparse patterns change.
* Taking advantage of the CGI interface, http-backend has been
updated to enable protocol v2 automatically when the other side
asks for it.
* The credential-cache helper has been adjusted to Windows.
* The error in "git help no-such-git-command" is handled better.
* The unicode character width table (used for output alignment) has
been updated.
* The ref iteration code used to optionally allow dangling refs to be
shown, which has been tightened up.
* "git add", "git mv", and "git rm" have been adjusted to avoid
updating paths outside of the sparse-checkout definition unless
the user specifies a "--sparse" option.
* "git repack" has been taught to generate multi-pack reachability
bitmaps.
* "git fsck" has been taught to report mismatch between expected and
actual types of an object better.
* In addition to GnuPG, ssh public crypto can be used for object and
push-cert signing. Note that this feature cannot be used with
ssh-keygen from OpenSSH 8.7, whose support for it is broken. Avoid
using it unless you update to OpenSSH 8.8.
* "git log --grep=string --author=name" learns to highlight hits just
like "git grep string" does.
Performance, Internal Implementation, Development Support etc.
* "git bisect" spawned "git show-branch" only to pretty-print the
title of the commit after checking out the next version to be
tested; this has been rewritten in C.
* "git add" can work better with the sparse index.
* Support for ancient versions of cURL library (pre 7.19.4) has been
dropped.
* A handful of tests that assumed implementation details of files
backend for refs have been cleaned up.
* trace2 logs learned to show parent process name to see in what
context Git was invoked.
* Loading of ref tips to prepare for common ancestry negotiation in
"git fetch-pack" has been optimized by taking advantage of the
commit graph when available.
* Remind developers that the userdiff patterns should be kept simple
and permissive, assuming that the contents they apply are always
syntactically correct.
* The current implementation of GIT_TEST_FAIL_PREREQS is broken in
that checking for the lack of a prerequisite would not work. Avoid
the use of "if ! test_have_prereq X" in a test script.
* The revision traversal API has been optimized by taking advantage
of the commit-graph, when available, to determine if a commit is
reachable from any of the existing refs.
* "git fetch --quiet" optimization to avoid useless computation of
info that will never be displayed.
* Callers from older advice_config[] based API has been updated to
use the newer advice_if_enabled() and advice_enabled() API.
* Teach "test_pause" and "debug" helpers to allow using the HOME and
TERM environment variables the user usually uses.
* "make INSTALL_STRIP=-s install" allows the installation step to use
"install -s" to strip the binaries as they get installed.
* Code that handles large number of refs in the "git fetch" code
path has been optimized.
* The reachability bitmap file used to be generated only for a single
pack, but now we've learned to generate bitmaps for history that
span across multiple packfiles.
* The code to make "git grep" recurse into submodules has been
updated to migrate away from the "add submodule's object store as
an alternate object store" mechanism (which is suboptimal).
* The tracing of process ancestry information has been enhanced.
* Reduce number of write(2) system calls while sending the
ref advertisement.
* Update the build procedure to use the "-pedantic" build when
DEVELOPER makefile macro is in effect.
* Large part of "git submodule add" gets rewritten in C.
* The run-command API has been updated so that the callers can easily
ask the file descriptors open for packfiles to be closed immediately
before spawning commands that may trigger auto-gc.
* An oddball OPTION_ARGUMENT feature has been removed from the
parse-options API.
* The mergesort implementation used to sort linked list has been
optimized.
* Remove external declaration of functions that no longer exist.
* "git multi-pack-index write --bitmap" learns to propagate the
hashcache from original bitmap to resulting bitmap.
* CI learns to run the leak sanitizer builds.
* "git grep --recurse-submodules" takes trees and blobs from the
submodule repository, but the textconv settings when processing a
blob from the submodule is not taken from the submodule repository.
A test is added to demonstrate the issue, without fixing it.
* Teach "git help -c" into helping the command line completion of
configuration variables.
* When "git cmd -h" shows more than one line of usage text (e.g.
the cmd subcommand may take sub-sub-command), parse-options API
learned to align these lines, even across i18n/l10n.
* Prevent "make sparse" from running for the source files that
haven't been modified.
* The code path to write a new version of .midx multi-pack index files
has learned to release the mmaped memory holding the current
version of .midx before removing them from the disk, as some
platforms do not allow removal of a file that still has mapping.
* A new feature has been added to abort early in the test framework.
Fixes since v2.33
-----------------
* Input validation of "git pack-objects --stdin-packs" has been
corrected.
* Bugfix for common ancestor negotiation recently introduced in "git
push" code path.
* "git pull" had various corner cases that were not well thought out
around its --rebase backend, e.g. "git pull --ff-only" did not stop
but went ahead and rebased when the history on other side is not a
descendant of our history. The series tries to fix them up.
* "git apply" miscounted the bytes and failed to read to the end of
binary hunks.
* "git range-diff" code clean-up.
* "git commit --fixup" now works with "--edit" again, after it was
broken in v2.32.
* Use upload-artifacts v1 (instead of v2) for 32-bit linux, as the
new version has a blocker bug for that architecture.
* Checking out all the paths from HEAD during the last conflicted
step in "git rebase" and continuing would cause the step to be
skipped (which is expected), but leaves MERGE_MSG file behind in
$GIT_DIR and confuses the next "git commit", which has been
corrected.
* Various bugs in "git rebase -r" have been fixed.
* mmap() imitation used to call xmalloc() that dies upon malloc()
failure, which has been corrected to just return an error to the
caller to be handled.
* "git diff --relative" segfaulted and/or produced incorrect result
when there are unmerged paths.
* The delayed checkout code path in "git checkout" etc. were chatty
even when --quiet and/or --no-progress options were given.
* "git branch -D <branch>" used to refuse to remove a broken branch
ref that points at a missing commit, which has been corrected.
* Build update for Apple clang.
* The parser for the "--nl" option of "git column" has been
corrected.
* "git upload-pack" which runs on the other side of "git fetch"
forgot to take the ref namespaces into account when handling
want-ref requests.
* The sparse-index support can corrupt the index structure by storing
a stale and/or uninitialized data, which has been corrected.
* Buggy tests could damage repositories outside the throw-away test
area we created. We now by default export GIT_CEILING_DIRECTORIES
to limit the damage from such a stray test.
* Even when running "git send-email" without its own threaded
discussion support, a threading related header in one message is
carried over to the subsequent message to result in an unwanted
threading, which has been corrected.
* The output from "git fast-export", when its anonymization feature
is in use, showed an annotated tag incorrectly.
* Recent "diff -m" changes broke "gitk", which has been corrected.
* The "git apply -3" code path learned not to bother the lower level
merge machinery when the three-way merge can be trivially resolved
without the content level merge. This fixes a regression caused by
recent "-3way first and fall back to direct application" change.
* The code that optionally creates the *.rev reverse index file has
been optimized to avoid needless computation when it is not writing
the file out.
* "git range-diff -I... <range> <range>" segfaulted, which has been
corrected.
* The order in which various files that make up a single (conceptual)
packfile has been reevaluated and straightened up. This matters in
correctness, as an incomplete set of files must not be shown to a
running Git.
* The "mode" word is useless in a call to open(2) that does not
create a new file. Such a call in the files backend of the ref
subsystem has been cleaned up.
* "git update-ref --stdin" failed to flush its output as needed,
which potentially led the conversation to a deadlock.
* When "git am --abort" fails to abort correctly, it still exited
with exit status of 0, which has been corrected.
* Correct nr and alloc members of strvec struct to be of type size_t.
* "git stash", where the tentative change involves changing a
directory to a file (or vice versa), was confused, which has been
corrected.
* "git clone" from a repository whose HEAD is unborn into a bare
repository didn't follow the branch name the other side used, which
is corrected.
* "git cvsserver" had a long-standing bug in its authentication code,
which has finally been corrected (it is unclear and is a separate
question if anybody is seriously using it, though).
* "git difftool --dir-diff" mishandled symbolic links.
* Sensitive data in the HTTP trace were supposed to be redacted, but
we failed to do so in HTTP/2 requests.
* "make clean" has been updated to remove leftover .depend/
directories, even when it is not told to use them to compute header
dependencies.
* Protocol v0 clients can get stuck parsing a malformed feature line.
* A few kinds of changes "git status" can show were not documented.
(merge d2a534c515 ja/doc-status-types-and-copies later to maint).
* The mergesort implementation used to sort linked list has been
optimized.
(merge c90cfc225b rs/mergesort later to maint).
* An editor session launched during a Git operation (e.g. during 'git
commit') can leave the terminal in a funny state. The code path
has updated to save the terminal state before, and restore it
after, it spawns an editor.
(merge 3d411afabc cm/save-restore-terminal later to maint).
* "git cat-file --batch" with the "--batch-all-objects" option is
supposed to iterate over all the objects found in a repository, but
it used to translate these object names using the replace mechanism,
which defeats the point of enumerating all objects in the repository.
This has been corrected.
(merge bf972896d7 jk/cat-file-batch-all-wo-replace later to maint).
* Recent sparse-index work broke safety against attempts to add paths
with trailing slashes to the index, which has been corrected.
(merge c8ad9d04c6 rs/make-verify-path-really-verify-again later to maint).
* The "--color-lines" and "--color-by-age" options of "git blame"
have been missing, which are now documented.
(merge 8c32856133 bs/doc-blame-color-lines later to maint).
* The PATH used in CI job may be too wide and let incompatible dlls
to be grabbed, which can cause the build&test to fail. Tighten it.
(merge 7491ef6198 js/windows-ci-path-fix later to maint).
* Avoid performance measurements from getting ruined by gc and other
housekeeping pauses interfering in the middle.
(merge be79131a53 rs/disable-gc-during-perf-tests later to maint).
* Stop "git add --dry-run" from creating new blob and tree objects.
(merge e578d0311d rs/add-dry-run-without-objects later to maint).
* "git commit" gave duplicated error message when the object store
was unwritable, which has been corrected.
(merge 4ef91a2d79 ab/fix-commit-error-message-upon-unwritable-object-store later to maint).
* Recent sparse-index addition, namely any use of index_name_pos(),
can expand sparse index entries and breaks any code that walks
cache-tree or existing index entries. One such instance of such a
breakage has been corrected.
* The xxdiff difftool backend can exit with status 128, which the
difftool-helper that launches the backend takes as a significant
failure, when it is not significant at all. Work it around.
(merge 571f4348dd da/mergetools-special-case-xxdiff-exit-128 later to maint).
* Improve test framework around unwritable directories.
(merge 5d22e18965 ab/test-cleanly-recreate-trash-directory later to maint).
* "git push" client talking to an HTTP server did not diagnose the
lack of the final status report from the other side correctly,
which has been corrected.
(merge c5c3486f38 jk/http-push-status-fix later to maint).
* Update "git archive" documentation and give explicit mention on the
compression level for both zip and tar.gz format.
(merge c4b208c309 bs/archive-doc-compression-level later to maint).
* Drop "git sparse-checkout" from the list of common commands.
(merge 6a9a50a8af sg/sparse-index-not-that-common-a-command later to maint).
* "git branch -c/-m new old" was not described to copy config, which
has been corrected.
(merge 8252ec300e jc/branch-copy-doc later to maint).
* Squelch over-eager warning message added during this cycle.
* Fix long-standing shell syntax error in the completion script.
(merge 46b0585286 re/completion-fix-test-equality later to maint).
* Teach "git commit-graph" command not to allow using replace objects
at all, as we do not use the commit-graph at runtime when we see
object replacement.
(merge 095d112f8c ab/ignore-replace-while-working-on-commit-graph later to maint).
* "git pull --no-verify" did not affect the underlying "git merge".
(merge 47bfdfb3fd ar/fix-git-pull-no-verify later to maint).
* One CI task based on Fedora image noticed a not-quite-kosher
construct recently, which has been corrected.
* "git pull --ff-only" and "git pull --rebase --ff-only" should make
it a no-op to attempt pulling from a remote that is behind us, but
instead the command errored out by saying it was impossible to
fast-forward, which may technically be true, but not a useful thing
to diagnose as an error. This has been corrected.
(merge 361cb52383 jc/fix-pull-ff-only-when-already-up-to-date later to maint).
* The way Cygwin emulates a unix-domain socket, on top of which the
simple-ipc mechanism is implemented, can race with the program on
the other side that wants to use the socket, and briefly make it
appear as a regular file before lstat(2) starts reporting it as a
socket. We now have a workaround on the side that connects to a
unix domain socket.
* Other code cleanup, docfix, build fix, etc.
(merge f188160be9 ab/bundle-remove-verbose-option later to maint).
(merge 8c6b4332b4 rs/close-pack-leakfix later to maint).
(merge 51b04c05b7 bs/difftool-msg-tweak later to maint).
(merge dd20e4a6db ab/make-compdb-fix later to maint).
(merge 6ffb990dc4 os/status-docfix later to maint).
(merge 100c2da2d3 rs/p3400-lose-tac later to maint).
(merge 76f3b69896 tb/aggregate-ignore-leading-whitespaces later to maint).
(merge 6e4fd8bfcd tz/doc-link-to-bundle-format-fix later to maint).
(merge f6c013dfa1 jc/doc-commit-header-continuation-line later to maint).
(merge ec9a37d69b ab/pkt-line-cleanup later to maint).
(merge 8650c6298c ab/fix-make-lint-docs later to maint).
(merge 1c720357ce ab/test-lib-diff-cleanup later to maint).
(merge 6b615dbece ks/submodule-add-message-fix later to maint).
(merge 203eb8381a jc/doc-format-patch-clarify-auto-base later to maint).
(merge 559664c792 ab/test-lib later to maint).

View File

@ -0,0 +1,23 @@
Git v2.34.1 Release Notes
=========================
This release is primarily to fix a handful of regressions in Git 2.34.
Fixes since v2.34
-----------------
* "git grep" looking in a blob that has non-UTF8 payload was
completely broken when linked with certain versions of PCREv2
library in the latest release.
* "git pull" with any strategy when the other side is behind us
should succeed as it is a no-op, but doesn't.
* An earlier change in 2.34.0 caused JGit application (that abused
GIT_EDITOR mechanism when invoking "git config") to get stuck with
a SIGTTOU signal; it has been reverted.
* An earlier change that broke .gitignore matching has been reverted.
* SubmittingPatches document gained a syntactically incorrect mark-up,
which has been corrected.

View File

@ -377,7 +377,7 @@ notes for details).
on that order.
* "git show 'HEAD:Foo[BAR]Baz'" did not interpret the argument as a
rev, i.e. the object named by the the pathname with wildcard
rev, i.e. the object named by the pathname with wildcard
characters in a tree object.
(merge aac4fac nd/dwim-wildcards-as-pathspecs later to maint).

View File

@ -74,10 +74,9 @@ the feature triggers the new behavior when it should, and to show the
feature does not trigger when it shouldn't. After any code change, make
sure that the entire test suite passes.
If you have an account at GitHub (and you can get one for free to work
on open source projects), you can use their Travis CI integration to
test your changes on Linux, Mac (and hopefully soon Windows). See
GitHub-Travis CI hints section for details.
Pushing to a fork of https://github.com/git/git will use their CI
integration to test your changes on Linux, Mac and Windows. See the
<<GHCI,GitHub CI>> section for details.
Do not forget to update the documentation to describe the updated
behavior and make sure that the resulting documentation set formats
@ -117,10 +116,13 @@ If in doubt which identifier to use, run `git log --no-merges` on the
files you are modifying to see the current conventions.
[[summary-section]]
It's customary to start the remainder of the first line after "area: "
with a lower-case letter. E.g. "doc: clarify...", not "doc:
Clarify...", or "githooks.txt: improve...", not "githooks.txt:
Improve...".
The title sentence after the "area:" prefix omits the full stop at the
end, and its first word is not capitalized unless there is a reason to
capitalize it other than because it is the first word in the sentence.
E.g. "doc: clarify...", not "doc: Clarify...", or "githooks.txt:
improve...", not "githooks.txt: Improve...". But "refs: HEAD is also
treated as a ref" is correct, as we spell `HEAD` in all caps even when
it appears in the middle of a sentence.
[[meaningful-message]]
The body should provide a meaningful commit message, which:
@ -164,6 +166,85 @@ or, on an older version of Git without support for --pretty=reference:
git show -s --date=short --pretty='format:%h (%s, %ad)' <commit>
....
[[sign-off]]
=== Certify your work by adding your `Signed-off-by` trailer
To improve tracking of who did what, we ask you to certify that you
wrote the patch or have the right to pass it on under the same license
as ours, by "signing off" your patch. Without sign-off, we cannot
accept your patches.
If (and only if) you certify the below D-C-O:
[[dco]]
.Developer's Certificate of Origin 1.1
____
By making a contribution to this project, I certify that:
a. The contribution was created in whole or in part by me and I
have the right to submit it under the open source license
indicated in the file; or
b. The contribution is based upon previous work that, to the best
of my knowledge, is covered under an appropriate open source
license and I have the right under that license to submit that
work with modifications, whether created in whole or in part
by me, under the same open source license (unless I am
permitted to submit under a different license), as indicated
in the file; or
c. The contribution was provided directly to me by some other
person who certified (a), (b) or (c) and I have not modified
it.
d. I understand and agree that this project and the contribution
are public and that a record of the contribution (including all
personal information I submit with it, including my sign-off) is
maintained indefinitely and may be redistributed consistent with
this project or the open source license(s) involved.
____
you add a "Signed-off-by" trailer to your commit, that looks like
this:
....
Signed-off-by: Random J Developer <random@developer.example.org>
....
This line can be added by Git if you run the git-commit command with
the -s option.
Notice that you can place your own `Signed-off-by` trailer when
forwarding somebody else's patch with the above rules for
D-C-O. Indeed you are encouraged to do so. Do not forget to
place an in-body "From: " line at the beginning to properly attribute
the change to its true author (see (2) above).
This procedure originally came from the Linux kernel project, so our
rule is quite similar to theirs, but what exactly it means to sign-off
your patch differs from project to project, so it may be different
from that of the project you are accustomed to.
[[real-name]]
Also notice that a real name is used in the `Signed-off-by` trailer. Please
don't hide your real name.
[[commit-trailers]]
If you like, you can put extra tags at the end:
. `Reported-by:` is used to credit someone who found the bug that
the patch attempts to fix.
. `Acked-by:` says that the person who is more familiar with the area
the patch attempts to modify liked the patch.
. `Reviewed-by:`, unlike the other tags, can only be offered by the
reviewers themselves when they are completely satisfied with the
patch after a detailed analysis.
. `Tested-by:` is used to indicate that the person applied the patch
and found it to have the desired effect.
You can also create your own tag or use one that's in common usage
such as "Thanks-to:", "Based-on-patch-by:", or "Mentored-by:".
[[git-tools]]
=== Generate your patch using Git tools out of your commits.
@ -299,86 +380,6 @@ Do not forget to add trailers such as `Acked-by:`, `Reviewed-by:` and
`Tested-by:` lines as necessary to credit people who helped your
patch, and "cc:" them when sending such a final version for inclusion.
[[sign-off]]
=== Certify your work by adding your `Signed-off-by` trailer
To improve tracking of who did what, we ask you to certify that you
wrote the patch or have the right to pass it on under the same license
as ours, by "signing off" your patch. Without sign-off, we cannot
accept your patches.
If (and only if) you certify the below D-C-O:
[[dco]]
.Developer's Certificate of Origin 1.1
____
By making a contribution to this project, I certify that:
a. The contribution was created in whole or in part by me and I
have the right to submit it under the open source license
indicated in the file; or
b. The contribution is based upon previous work that, to the best
of my knowledge, is covered under an appropriate open source
license and I have the right under that license to submit that
work with modifications, whether created in whole or in part
by me, under the same open source license (unless I am
permitted to submit under a different license), as indicated
in the file; or
c. The contribution was provided directly to me by some other
person who certified (a), (b) or (c) and I have not modified
it.
d. I understand and agree that this project and the contribution
are public and that a record of the contribution (including all
personal information I submit with it, including my sign-off) is
maintained indefinitely and may be redistributed consistent with
this project or the open source license(s) involved.
____
you add a "Signed-off-by" trailer to your commit, that looks like
this:
....
Signed-off-by: Random J Developer <random@developer.example.org>
....
This line can be added by Git if you run the git-commit command with
the -s option.
Notice that you can place your own `Signed-off-by` trailer when
forwarding somebody else's patch with the above rules for
D-C-O. Indeed you are encouraged to do so. Do not forget to
place an in-body "From: " line at the beginning to properly attribute
the change to its true author (see (2) above).
This procedure originally came from the Linux kernel project, so our
rule is quite similar to theirs, but what exactly it means to sign-off
your patch differs from project to project, so it may be different
from that of the project you are accustomed to.
[[real-name]]
Also notice that a real name is used in the `Signed-off-by` trailer. Please
don't hide your real name.
[[commit-trailers]]
If you like, you can put extra tags at the end:
. `Reported-by:` is used to credit someone who found the bug that
the patch attempts to fix.
. `Acked-by:` says that the person who is more familiar with the area
the patch attempts to modify liked the patch.
. `Reviewed-by:`, unlike the other tags, can only be offered by the
reviewer and means that she is completely satisfied that the patch
is ready for application. It is usually offered only after a
detailed review.
. `Tested-by:` is used to indicate that the person applied the patch
and found it to have the desired effect.
You can also create your own tag or use one that's in common usage
such as "Thanks-to:", "Based-on-patch-by:", or "Mentored-by:".
== Subsystems with dedicated maintainers
Some parts of the system have dedicated maintainers with their own
@ -447,13 +448,12 @@ their trees themselves.
entitled "What's cooking in git.git" and "What's in git.git" giving
the status of various proposed changes.
[[travis]]
== GitHub-Travis CI hints
== GitHub CI[[GHCI]]
With an account at GitHub (you can get one for free to work on open
source projects), you can use Travis CI to test your changes on Linux,
Mac (and hopefully soon Windows). You can find a successful example
test build here: https://travis-ci.org/git/git/builds/120473209
With an account at GitHub, you can use GitHub CI to test your changes
on Linux, Mac and Windows. See
https://github.com/git/git/actions/workflows/main.yml for examples of
recent CI runs.
Follow these steps for the initial setup:
@ -461,31 +461,18 @@ Follow these steps for the initial setup:
You can find detailed instructions how to fork here:
https://help.github.com/articles/fork-a-repo/
. Open the Travis CI website: https://travis-ci.org
. Press the "Sign in with GitHub" button.
. Grant Travis CI permissions to access your GitHub account.
You can find more information about the required permissions here:
https://docs.travis-ci.com/user/github-oauth-scopes
. Open your Travis CI profile page: https://travis-ci.org/profile
. Enable Travis CI builds for your Git fork.
After the initial setup, Travis CI will run whenever you push new changes
After the initial setup, CI will run whenever you push new changes
to your fork of Git on GitHub. You can monitor the test state of all your
branches here: https://travis-ci.org/__<Your GitHub handle>__/git/branches
branches here: `https://github.com/<Your GitHub handle>/git/actions/workflows/main.yml`
If a branch did not pass all test cases then it is marked with a red
cross. In that case you can click on the failing Travis CI job and
scroll all the way down in the log. Find the line "<-- Click here to see
detailed test output!" and click on the triangle next to the log line
number to expand the detailed test output. Here is such a failing
example: https://travis-ci.org/git/git/jobs/122676187
cross. In that case you can click on the failing job and navigate to
"ci/run-build-and-tests.sh" and/or "ci/print-test-failures.sh". You
can also download "Artifacts" which are tarred (or zipped) archives
with test data relevant for debugging.
Fix the problem and push your fix to your Git fork. This will trigger
a new Travis CI build to ensure all tests pass.
Then fix the problem and push your fix to your GitHub fork. This will
trigger a new CI build to ensure all tests pass.
[[mua]]
== MUA specific hints

View File

@ -1,6 +1,6 @@
-b::
Show blank SHA-1 for boundary commits. This can also
be controlled via the `blame.blankboundary` config option.
be controlled via the `blame.blankBoundary` config option.
--root::
Do not treat root commits as boundaries. This can also be
@ -136,5 +136,16 @@ take effect.
option. An empty file name, `""`, will clear the list of revs from
previously processed files.
--color-lines::
Color line annotations in the default format differently if they come from
the same commit as the preceding line. This makes it easier to distinguish
code blocks introduced by different commits. The color defaults to cyan and
can be adjusted using the `color.blame.repeatedLines` config option.
--color-by-age::
Color line annotations depending on the age of the line in the default format.
The `color.blame.highlightRecent` config option controls what color is used for
each range of age.
-h::
Show help message.

View File

@ -46,7 +46,7 @@ Subsection names are case sensitive and can contain any characters except
newline and the null byte. Doublequote `"` and backslash can be included
by escaping them as `\"` and `\\`, respectively. Backslashes preceding
other characters are dropped when reading; for example, `\t` is read as
`t` and `\0` is read as `0` Section headers cannot span multiple lines.
`t` and `\0` is read as `0`. Section headers cannot span multiple lines.
Variables may belong directly to a section or to a given subsection. You
can have `[section]` if you have `[section "subsection"]`, but you don't
need to.
@ -298,6 +298,15 @@ pathname::
tilde expansion happens to such a string: `~/`
is expanded to the value of `$HOME`, and `~user/` to the
specified user's home directory.
+
If a path starts with `%(prefix)/`, the remainder is interpreted as a
path relative to Git's "runtime prefix", i.e. relative to the location
where Git itself was installed. For example, `%(prefix)/bin/` refers to
the directory in which the Git executable itself lives. If Git was
compiled without runtime prefix support, the compiled-in prefix will be
substituted instead. In the unlikely event that a literal path needs to
be specified that should _not_ be expanded, it needs to be prefixed by
`./`, like so: `./%(prefix)/bin`.
Variables
@ -398,6 +407,8 @@ include::config/interactive.txt[]
include::config/log.txt[]
include::config/lsrefs.txt[]
include::config/mailinfo.txt[]
include::config/mailmap.txt[]
@ -438,8 +449,6 @@ include::config/rerere.txt[]
include::config/reset.txt[]
include::config/safe.txt[]
include::config/sendemail.txt[]
include::config/sequencer.txt[]

View File

@ -44,6 +44,9 @@ advice.*::
Shown when linkgit:git-push[1] rejects a forced update of
a branch when its remote-tracking ref has updates that we
do not have locally.
skippedCherryPicks::
Shown when linkgit:git-rebase[1] skips a commit that has already
been cherry-picked onto the upstream branch.
statusAheadBehind::
Shown when linkgit:git-status[1] computes the ahead/behind
counts for a local ref compared to its remote tracking ref,
@ -119,4 +122,8 @@ advice.*::
addEmptyPathspec::
Advice shown if a user runs the add command without providing
the pathspec parameter.
updateSparsePath::
Advice shown when either linkgit:git-add[1] or linkgit:git-rm[1]
is asked to update index entries outside the current sparse
checkout.
--

View File

@ -27,7 +27,7 @@ blame.ignoreRevsFile::
file names will reset the list of ignored revisions. This option will
be handled before the command line option `--ignore-revs-file`.
blame.markUnblamables::
blame.markUnblamableLines::
Mark lines that were changed by an ignored revision that we could not
attribute to another commit with a '*' in the output of
linkgit:git-blame[1].

View File

@ -85,10 +85,6 @@ When `merges` (or just 'm'), pass the `--rebase-merges` option to 'git rebase'
so that the local merge commits are included in the rebase (see
linkgit:git-rebase[1] for details).
+
When `preserve` (or just 'p', deprecated in favor of `merges`), also pass
`--preserve-merges` along to 'git rebase' so that locally committed merge
commits will not be flattened by running 'git pull'.
+
When the value is `interactive` (or just 'i'), the rebase is run in interactive
mode.
+

View File

@ -21,3 +21,24 @@ checkout.guess::
Provides the default value for the `--guess` or `--no-guess`
option in `git checkout` and `git switch`. See
linkgit:git-switch[1] and linkgit:git-checkout[1].
checkout.workers::
The number of parallel workers to use when updating the working tree.
The default is one, i.e. sequential execution. If set to a value less
than one, Git will use as many workers as the number of logical cores
available. This setting and `checkout.thresholdForParallelism` affect
all commands that perform checkout. E.g. checkout, clone, reset,
sparse-checkout, etc.
+
Note: parallel checkout usually delivers better performance for repositories
located on SSDs or over NFS. For repositories on spinning disks and/or machines
with a small number of cores, the default sequential checkout often performs
better. The size and compression level of a repository might also influence how
well the parallel version performs.
checkout.thresholdForParallelism::
When running parallel checkout with a small number of files, the cost
of subprocess spawning and inter-process communication might outweigh
the parallelization gains. This setting allows to define the minimum
number of files for which parallel checkout should be attempted. The
default is 100.

View File

@ -2,3 +2,7 @@ clone.defaultRemoteName::
The name of the remote to create when cloning a repository. Defaults to
`origin`, and can be overridden by passing the `--origin` command-line
option to linkgit:git-clone[1].
clone.rejectShallow::
Reject to clone a repository if it is a shallow one, can be overridden by
passing option `--reject-shallow` in command line. See linkgit:git-clone[1]

View File

@ -9,26 +9,27 @@ color.advice.hint::
Use customized color for hints.
color.blame.highlightRecent::
This can be used to color the metadata of a blame line depending
on age of the line.
Specify the line annotation color for `git blame --color-by-age`
depending upon the age of the line.
+
This setting should be set to a comma-separated list of color and date settings,
starting and ending with a color, the dates should be set from oldest to newest.
The metadata will be colored given the colors if the line was introduced
before the given timestamp, overwriting older timestamped colors.
This setting should be set to a comma-separated list of color and
date settings, starting and ending with a color, the dates should be
set from oldest to newest. The metadata will be colored with the
specified colors if the line was introduced before the given
timestamp, overwriting older timestamped colors.
+
Instead of an absolute timestamp relative timestamps work as well, e.g.
2.weeks.ago is valid to address anything older than 2 weeks.
Instead of an absolute timestamp relative timestamps work as well,
e.g. `2.weeks.ago` is valid to address anything older than 2 weeks.
+
It defaults to 'blue,12 month ago,white,1 month ago,red', which colors
everything older than one year blue, recent changes between one month and
one year old are kept white, and lines introduced within the last month are
colored red.
It defaults to `blue,12 month ago,white,1 month ago,red`, which
colors everything older than one year blue, recent changes between
one month and one year old are kept white, and lines introduced
within the last month are colored red.
color.blame.repeatedLines::
Use the customized color for the part of git-blame output that
is repeated meta information per line (such as commit id,
author name, date and timezone). Defaults to cyan.
Use the specified color to colorize line annotations for
`git blame --color-lines`, if they come from the same commit as the
preceding line. Defaults to cyan.
color.branch::
A boolean to enable/disable color in the output of
@ -104,9 +105,12 @@ color.grep.<slot>::
`matchContext`;;
matching text in context lines
`matchSelected`;;
matching text in selected lines
matching text in selected lines. Also, used to customize the following
linkgit:git-log[1] subcommands: `--grep`, `--author` and `--committer`.
`selected`;;
non-matching text in selected lines
non-matching text in selected lines. Also, used to customize the
following linkgit:git-log[1] subcommands: `--grep`, `--author` and
`--committer`.
`separator`;;
separators between fields on a line (`:`, `-`, and `=`)
and between hunks (`--`)
@ -127,8 +131,9 @@ color.interactive.<slot>::
interactive commands.
color.pager::
A boolean to enable/disable colored output when the pager is in
use (default is true).
A boolean to specify whether `auto` color modes should colorize
output going to the pager. Defaults to true; set this to false
if your pager does not understand ANSI color codes.
color.push::
A boolean to enable/disable color in push errors. May be set to

View File

@ -1,3 +1,9 @@
commitGraph.generationVersion::
Specifies the type of generation number version to use when writing
or reading the commit-graph file. If version 1 is specified, then
the corrected commit dates will not be written or read. Defaults to
2.
commitGraph.maxNewFilters::
Specifies the default value for the `--max-new-filters` option of `git
commit-graph write` (c.f., linkgit:git-commit-graph[1]).

View File

@ -625,4 +625,6 @@ core.abbrev::
computed based on the approximate number of packed objects
in your repository, which hopefully is enough for
abbreviated object names to stay unique for some time.
If set to "no", no abbreviation is made and the object names
are shown in their full length.
The minimum length is 4.

View File

@ -85,6 +85,8 @@ diff.ignoreSubmodules::
and 'git status' when `status.submoduleSummary` is set unless it is
overridden by using the --ignore-submodules command-line option.
The 'git submodule' commands are not affected by this setting.
By default this is set to untracked so that any untracked
submodules are ignored.
diff.mnemonicPrefix::
If set, 'git diff' uses a prefix pair that is different from the
@ -116,9 +118,10 @@ diff.orderFile::
relative to the top of the working tree.
diff.renameLimit::
The number of files to consider when performing the copy/rename
detection; equivalent to the 'git diff' option `-l`. This setting
has no effect if rename detection is turned off.
The number of files to consider in the exhaustive portion of
copy/rename detection; equivalent to the 'git diff' option
`-l`. If not set, the default value is currently 1000. This
setting has no effect if rename detection is turned off.
diff.renames::
Whether and how Git detects renames. If set to "false",

View File

@ -69,7 +69,8 @@ fetch.negotiationAlgorithm::
setting defaults to "skipping".
Unknown values will cause 'git fetch' to error out.
+
See also the `--negotiation-tip` option for linkgit:git-fetch[1].
See also the `--negotiate-only` and `--negotiation-tip` options to
linkgit:git-fetch[1].
fetch.showForcedUpdates::
Set to false to enable `--no-show-forced-updates` in

View File

@ -11,13 +11,13 @@ gpg.program::
gpg.format::
Specifies which key format to use when signing with `--gpg-sign`.
Default is "openpgp" and another possible value is "x509".
Default is "openpgp". Other possible values are "x509", "ssh".
gpg.<format>.program::
Use this to customize the program used for the signing format you
chose. (see `gpg.program` and `gpg.format`) `gpg.program` can still
be used as a legacy synonym for `gpg.openpgp.program`. The default
value for `gpg.x509.program` is "gpgsm".
value for `gpg.x509.program` is "gpgsm" and `gpg.ssh.program` is "ssh-keygen".
gpg.minTrustLevel::
Specifies a minimum trust level for signature verification. If
@ -33,3 +33,42 @@ gpg.minTrustLevel::
* `marginal`
* `fully`
* `ultimate`
gpg.ssh.defaultKeyCommand:
This command that will be run when user.signingkey is not set and a ssh
signature is requested. On successful exit a valid ssh public key is
expected in the first line of its output. To automatically use the first
available key from your ssh-agent set this to "ssh-add -L".
gpg.ssh.allowedSignersFile::
A file containing ssh public keys which you are willing to trust.
The file consists of one or more lines of principals followed by an ssh
public key.
e.g.: user1@example.com,user2@example.com ssh-rsa AAAAX1...
See ssh-keygen(1) "ALLOWED SIGNERS" for details.
The principal is only used to identify the key and is available when
verifying a signature.
+
SSH has no concept of trust levels like gpg does. To be able to differentiate
between valid signatures and trusted signatures the trust level of a signature
verification is set to `fully` when the public key is present in the allowedSignersFile.
Otherwise the trust level is `undefined` and git verify-commit/tag will fail.
+
This file can be set to a location outside of the repository and every developer
maintains their own trust store. A central repository server could generate this
file automatically from ssh keys with push access to verify the code against.
In a corporate setting this file is probably generated at a global location
from automation that already handles developer ssh keys.
+
A repository that only allows signed commits can store the file
in the repository itself using a path relative to the top-level of the working tree.
This way only committers with an already valid key can add or change keys in the keyring.
+
Using a SSH CA key with the cert-authority option
(see ssh-keygen(1) "CERTIFICATES") is also valid.
gpg.ssh.revocationFile::
Either a SSH KRL or a list of revoked public keys (without the principal prefix).
See ssh-keygen(1) for details.
If a public key is found in this file then it will always be treated
as having trust level "never" and signatures will show as invalid.

View File

@ -11,7 +11,7 @@ gui.displayUntracked::
in the file list. The default is "true".
gui.encoding::
Specifies the default encoding to use for displaying of
Specifies the default character encoding to use for displaying of
file contents in linkgit:git-gui[1] and linkgit:gitk[1].
It can be overridden by setting the 'encoding' attribute
for relevant files (see linkgit:gitattributes[5]).

View File

@ -9,13 +9,15 @@ help.format::
help.autoCorrect::
If git detects typos and can identify exactly one valid command similar
to the error, git will automatically run the intended command after
waiting a duration of time defined by this configuration value in
deciseconds (0.1 sec). If this value is 0, the suggested corrections
will be shown, but not executed. If it is a negative integer, or
"immediate", the suggested command
is run immediately. If "never", suggestions are not shown at all. The
default value is zero.
to the error, git will try to suggest the correct command or even
run the suggestion automatically. Possible config values are:
- 0 (default): show the suggested command.
- positive number: run the suggested command after specified
deciseconds (0.1 sec).
- "immediate": run the suggested command immediately.
- "prompt": show the suggestion and prompt for confirmation to run
the command.
- "never": don't run or show any suggested command.
help.htmlPath::
Specify the path where the HTML documentation resides. File system paths

View File

@ -14,6 +14,11 @@ index.recordOffsetTable::
Defaults to 'true' if index.threads has been explicitly enabled,
'false' otherwise.
index.sparse::
When enabled, write the index using sparse-directory entries. This
has no effect unless `core.sparseCheckout` and
`core.sparseCheckoutCone` are both enabled. Defaults to 'false'.
index.threads::
Specifies the number of threads to spawn when loading the index.
This is meant to reduce index load time on multiprocessor machines.

View File

@ -4,4 +4,4 @@ init.templateDir::
init.defaultBranch::
Allows overriding the default branch name e.g. when initializing
a new repository or when cloning an empty repository.
a new repository.

View File

@ -24,6 +24,11 @@ log.excludeDecoration::
the config option can be overridden by the `--decorate-refs`
option.
log.diffMerges::
Set default diff format to be used for merge commits. See
`--diff-merges` in linkgit:git-log[1] for details.
Defaults to `separate`.
log.follow::
If `true`, `git log` will act as if the `--follow` option was used when
a single <path> is given. This has the same limitations as `--follow`,

View File

@ -0,0 +1,9 @@
lsrefs.unborn::
May be "advertise" (the default), "allow", or "ignore". If "advertise",
the server will respond to the client sending "unborn" (as described in
protocol-v2.txt) and will advertise support for this feature during the
protocol v2 capability advertisement. "allow" is the same as
"advertise" except that the server will not advertise support for this
feature; this is useful for load-balanced servers that cannot be
updated atomically (for example), since the administrator could
configure "allow", then after a delay, configure "advertise".

View File

@ -15,8 +15,9 @@ maintenance.strategy::
* `none`: This default setting implies no task are run at any schedule.
* `incremental`: This setting optimizes for performing small maintenance
activities that do not delete any data. This does not schedule the `gc`
task, but runs the `prefetch` and `commit-graph` tasks hourly and the
`loose-objects` and `incremental-repack` tasks daily.
task, but runs the `prefetch` and `commit-graph` tasks hourly, the
`loose-objects` and `incremental-repack` tasks daily, and the `pack-refs`
task weekly.
maintenance.<task>.enabled::
This boolean config option controls whether the maintenance task

View File

@ -14,7 +14,7 @@ merge.defaultToUpstream::
branches at the remote named by `branch.<current branch>.remote`
are consulted, and then they are mapped via `remote.<remote>.fetch`
to their corresponding remote-tracking branches, and the tips of
these tracking branches are merged.
these tracking branches are merged. Defaults to true.
merge.ff::
By default, Git does not create an extra merge commit when merging
@ -33,10 +33,12 @@ merge.verifySignatures::
include::fmt-merge-msg.txt[]
merge.renameLimit::
The number of files to consider when performing rename detection
during a merge; if not specified, defaults to the value of
diff.renameLimit. This setting has no effect if rename detection
is turned off.
The number of files to consider in the exhaustive portion of
rename detection during a merge. If not specified, defaults
to the value of diff.renameLimit. If neither
merge.renameLimit nor diff.renameLimit are specified,
currently defaults to 7000. This setting has no effect if
rename detection is turned off.
merge.renames::
Whether Git detects renames. If set to "false", rename detection

View File

@ -13,6 +13,11 @@ mergetool.<tool>.cmd::
merged; 'MERGED' contains the name of the file to which the merge
tool should write the results of a successful merge.
mergetool.<tool>.hideResolved::
Allows the user to override the global `mergetool.hideResolved` value
for a specific tool. See `mergetool.hideResolved` for the full
description.
mergetool.<tool>.trustExitCode::
For a custom merge command, specify whether the exit code of
the merge command can be used to determine whether the merge was
@ -40,6 +45,16 @@ mergetool.meld.useAutoMerge::
value of `false` avoids using `--auto-merge` altogether, and is the
default value.
mergetool.hideResolved::
During a merge Git will automatically resolve as many conflicts as
possible and write the 'MERGED' file containing conflict markers around
any conflicts that it cannot resolve; 'LOCAL' and 'REMOTE' normally
represent the versions of the file from before Git's conflict
resolution. This flag causes 'LOCAL' and 'REMOTE' to be overwriten so
that only the unresolved conflicts are presented to the merge tool. Can
be configured per-tool via the `mergetool.<tool>.hideResolved`
configuration variable. Defaults to `false`.
mergetool.keepBackup::
After performing a merge, the original file with conflict markers
can be saved as a file with a `.orig` extension. If this variable

View File

@ -99,12 +99,23 @@ pack.packSizeLimit::
packing to a file when repacking, i.e. the git:// protocol
is unaffected. It can be overridden by the `--max-pack-size`
option of linkgit:git-repack[1]. Reaching this limit results
in the creation of multiple packfiles; which in turn prevents
bitmaps from being created.
The minimum size allowed is limited to 1 MiB.
The default is unlimited.
Common unit suffixes of 'k', 'm', or 'g' are
supported.
in the creation of multiple packfiles.
+
Note that this option is rarely useful, and may result in a larger total
on-disk size (because Git will not store deltas between packs), as well
as worse runtime performance (object lookup within multiple packs is
slower than a single pack, and optimizations like reachability bitmaps
cannot cope with multiple packs).
+
If you need to actively run Git using smaller packfiles (e.g., because your
filesystem does not support large files), this option may help. But if
your goal is to transmit a packfile over a medium that supports limited
sizes (e.g., removable media that cannot store the whole repository),
you are likely better off creating a single large packfile and splitting
it using a generic multi-volume archive tool (e.g., Unix `split`).
+
The minimum size allowed is limited to 1 MiB. The default is unlimited.
Common unit suffixes of 'k', 'm', or 'g' are supported.
pack.useBitmaps::
When true, git will use pack bitmaps (if available) when packing
@ -122,6 +133,21 @@ pack.useSparse::
commits contain certain types of direct renames. Default is
`true`.
pack.preferBitmapTips::
When selecting which commits will receive bitmaps, prefer a
commit at the tip of any reference that is a suffix of any value
of this configuration over any other commits in the "selection
window".
+
Note that setting this configuration to `refs/foo` does not mean that
the commits at the tips of `refs/foo/bar` and `refs/foo/baz` will
necessarily be selected. This is because commits are selected for
bitmaps from within a series of windows of variable length.
+
If a commit at the tip of any reference which is a suffix of any value
of this configuration is seen in a window, it is immediately given
preference over any other commit in that window.
pack.writeBitmaps (deprecated)::
This is a deprecated synonym for `repack.writeBitmaps`.
@ -133,3 +159,14 @@ pack.writeBitmapHashCache::
between an older, bitmapped pack and objects that have been
pushed since the last gc). The downside is that it consumes 4
bytes per object of disk space. Defaults to true.
+
When writing a multi-pack reachability bitmap, no new namehashes are
computed; instead, any namehashes stored in an existing bitmap are
permuted into their appropriate location when writing a new bitmap.
pack.writeReverseIndex::
When true, git will write a corresponding .rev file (see:
link:../technical/pack-format.html[Documentation/technical/pack-format.txt])
for each new packfile that it writes in all places except for
linkgit:git-fast-import[1] and in the bulk checkin mechanism.
Defaults to false.

View File

@ -1,10 +1,10 @@
protocol.allow::
If set, provide a user defined default policy for all protocols which
don't explicitly have a policy (`protocol.<name>.allow`). By default,
if unset, known-safe protocols (http, https, git, ssh) have a
if unset, known-safe protocols (http, https, git, ssh, file) have a
default policy of `always`, known-dangerous protocols (ext) have a
default policy of `never`, and all other protocols (including file)
have a default policy of `user`. Supported policies:
default policy of `never`, and all other protocols have a default
policy of `user`. Supported policies:
+
--

View File

@ -18,10 +18,6 @@ When `merges` (or just 'm'), pass the `--rebase-merges` option to 'git rebase'
so that the local merge commits are included in the rebase (see
linkgit:git-rebase[1] for details).
+
When `preserve` (or just 'p', deprecated in favor of `merges`), also pass
`--preserve-merges` along to 'git rebase' so that locally committed merge
commits will not be flattened by running 'git pull'.
+
When the value is `interactive` (or just 'i'), the rebase is run in interactive
mode.
+

View File

@ -24,15 +24,14 @@ push.default::
* `tracking` - This is a deprecated synonym for `upstream`.
* `simple` - in centralized workflow, work like `upstream` with an
added safety to refuse to push if the upstream branch's name is
different from the local one.
* `simple` - pushes the current branch with the same name on the remote.
+
When pushing to a remote that is different from the remote you normally
pull from, work as `current`. This is the safest option and is suited
for beginners.
If you are working on a centralized workflow (pushing to the same repository you
pull from, which is typically `origin`), then you need to configure an upstream
branch with the same name.
+
This mode has become the default in Git 2.0.
This mode is the default since Git 2.0, and is the safest option suited for
beginners.
* `matching` - push all branches having the same name on both ends.
This makes the repository you are pushing to remember the set of
@ -120,3 +119,10 @@ push.useForceIfIncludes::
`--force-if-includes` as an option to linkgit:git-push[1]
in the command line. Adding `--no-force-if-includes` at the
time of push overrides this configuration setting.
push.negotiate::
If set to "true", attempt to reduce the size of the packfile
sent by rounds of negotiation in which the client and the
server attempt to find commits in common. If "false", Git will
rely solely on the server's ref advertisement to find commits
in common.

View File

@ -1,10 +1,3 @@
rebase.useBuiltin::
Unused configuration variable. Used in Git versions 2.20 and
2.21 as an escape hatch to enable the legacy shellscript
implementation of rebase. Now the built-in rewrite of it in C
is always used. Setting this will emit a warning, to alert any
remaining users that setting this now does nothing.
rebase.backend::
Default backend to use for rebasing. Possible choices are
'apply' or 'merge'. In the future, if the merge backend gains
@ -68,3 +61,6 @@ rebase.rescheduleFailedExec::
Automatically reschedule `exec` commands that failed. This only makes
sense in interactive mode (or when an `--exec` option was provided).
This is the same as specifying the `--reschedule-failed-exec` option.
rebase.forkPoint::
If set to false set `--no-fork-point` option by default.

View File

@ -1,42 +0,0 @@
safe.directory::
These config entries specify Git-tracked directories that are
considered safe even if they are owned by someone other than the
current user. By default, Git will refuse to even parse a Git
config of a repository owned by someone else, let alone run its
hooks, and this config setting allows users to specify exceptions,
e.g. for intentionally shared repositories (see the `--shared`
option in linkgit:git-init[1]).
+
This is a multi-valued setting, i.e. you can add more than one directory
via `git config --add`. To reset the list of safe directories (e.g. to
override any such directories specified in the system config), add a
`safe.directory` entry with an empty value.
+
This config setting is only respected when specified in a system or global
config, not when it is specified in a repository config or via the command
line option `-c safe.directory=<path>`.
+
The value of this setting is interpolated, i.e. `~/<path>` expands to a
path relative to the home directory and `%(prefix)/<path>` expands to a
path relative to Git's (runtime) prefix.
+
To completely opt-out of this security check, set `safe.directory` to the
string `*`. This will allow all repositories to be treated as if their
directory was listed in the `safe.directory` list. If `safe.directory=*`
is set in system config and you want to re-enable this protection, then
initialize your list with an empty value before listing the repositories
that you deem safe.
+
As explained, Git only allows you to access repositories owned by
yourself, i.e. the user who is running Git, by default. When Git
is running as 'root' in a non Windows platform that provides sudo,
however, git checks the SUDO_UID environment variable that sudo creates
and will allow access to the uid recorded as its value in addition to
the id from 'root'.
This is to make it easy to perform a common sequence during installation
"make && sudo make install". A git process running under 'sudo' runs as
'root' but the 'sudo' command exports the environment variable to record
which id the original user has.
If that is not what you would prefer and want git to only trust
repositories that are owned by root instead, then you can remove
the `SUDO_UID` variable from root's environment before invoking git.

View File

@ -8,9 +8,6 @@ sendemail.smtpEncryption::
See linkgit:git-send-email[1] for description. Note that this
setting is not subject to the 'identity' mechanism.
sendemail.smtpssl (deprecated)::
Deprecated alias for 'sendemail.smtpEncryption = ssl'.
sendemail.smtpsslcertpath::
Path to ca-certificates (either a directory or a single file).
Set it to an empty string to disable certificate verification.

View File

@ -5,6 +5,11 @@ stash.useBuiltin::
is always used. Setting this will emit a warning, to alert any
remaining users that setting this now does nothing.
stash.showIncludeUntracked::
If this is set to true, the `git stash show` command will show
the untracked files of a stash entry. Defaults to false. See
description of 'show' command in linkgit:git-stash[1].
stash.showPatch::
If this is set to true, the `git stash show` command without an
option will show the stash entry in patch form. Defaults to false.

View File

@ -58,8 +58,9 @@ submodule.active::
commands. See linkgit:gitsubmodules[7] for details.
submodule.recurse::
Specifies if commands recurse into submodules by default. This
applies to all commands that have a `--recurse-submodules` option
A boolean indicating if commands should enable the `--recurse-submodules`
option by default.
Applies to all commands that support this option
(`checkout`, `fetch`, `grep`, `pull`, `push`, `read-tree`, `reset`,
`restore` and `switch`) except `clone` and `ls-files`.
Defaults to false.

View File

@ -52,13 +52,17 @@ If you have multiple hideRefs values, later entries override earlier ones
(and entries in more-specific config files override less-specific ones).
+
If a namespace is in use, the namespace prefix is stripped from each
reference before it is matched against `transfer.hiderefs` patterns.
reference before it is matched against `transfer.hiderefs` patterns. In
order to match refs before stripping, add a `^` in front of the ref name. If
you combine `!` and `^`, `!` must be specified first.
+
For example, if `refs/heads/master` is specified in `transfer.hideRefs` and
the current namespace is `foo`, then `refs/namespaces/foo/refs/heads/master`
is omitted from the advertisements but `refs/heads/master` and
`refs/namespaces/bar/refs/heads/master` are still advertised as so-called
"have" lines. In order to match refs before stripping, add a `^` in front of
the ref name. If you combine `!` and `^`, `!` must be specified first.
is omitted from the advertisements. If `uploadpack.allowRefInWant` is set,
`upload-pack` will treat `want-ref refs/heads/master` in a protocol v2
`fetch` command as if `refs/namespaces/foo/refs/heads/master` did not exist.
`receive-pack`, on the other hand, will still advertise the object id the
ref is pointing to without mentioning its name (a so-called ".have" line).
+
Even if you hide refs, a client may still be able to steal the target
objects via the techniques described in the "SECURITY" section of the

View File

@ -59,15 +59,16 @@ uploadpack.allowFilter::
uploadpackfilter.allow::
Provides a default value for unspecified object filters (see: the
below configuration variable).
below configuration variable). If set to `true`, this will also
enable all filters which get added in the future.
Defaults to `true`.
uploadpackfilter.<filter>.allow::
Explicitly allow or ban the object filter corresponding to
`<filter>`, where `<filter>` may be one of: `blob:none`,
`blob:limit`, `tree`, `sparse:oid`, or `combine`. If using
combined filters, both `combine` and all of the nested filter
kinds must be allowed. Defaults to `uploadpackfilter.allow`.
`blob:limit`, `object:type`, `tree`, `sparse:oid`, or `combine`.
If using combined filters, both `combine` and all of the nested
filter kinds must be allowed. Defaults to `uploadpackfilter.allow`.
uploadpackfilter.tree.maxDepth::
Only allow `--filter=tree:<n>` when `<n>` is no more than the value of

View File

@ -36,3 +36,10 @@ user.signingKey::
commit, you can override the default selection with this variable.
This option is passed unchanged to gpg's --local-user parameter,
so you may specify a key using any method that gpg supports.
If gpg.format is set to "ssh" this can contain the literal ssh public
key (e.g.: "ssh-rsa XXXXXX identifier") or a file which contains it and
corresponds to the private key used for signing. The private key
needs to be available via ssh-agent. Alternatively it can be set to
a file containing a private key directly. If not set git will call
gpg.ssh.defaultKeyCommand (e.g.: "ssh-add -L") and try to use the first
key available.

View File

@ -1,10 +1,7 @@
DATE FORMATS
------------
The `GIT_AUTHOR_DATE`, `GIT_COMMITTER_DATE` environment variables
ifdef::git-commit[]
and the `--date` option
endif::git-commit[]
The `GIT_AUTHOR_DATE` and `GIT_COMMITTER_DATE` environment variables
support the following date formats:
Git internal format::
@ -26,3 +23,9 @@ ISO 8601::
+
NOTE: In addition, the date part is accepted in the following formats:
`YYYY.MM.DD`, `MM/DD/YYYY` and `DD.MM.YYYY`.
ifdef::git-commit[]
In addition to recognizing all date formats above, the `--date` option
will also try to make sense of other, more human-centric date formats,
such as relative dates like "yesterday" or "last Friday at noon".
endif::git-commit[]

View File

@ -59,7 +59,7 @@ Possible status letters are:
- D: deletion of a file
- M: modification of the contents or mode of a file
- R: renaming of a file
- T: change in the type of the file
- T: change in the type of the file (regular file, symbolic link or submodule)
- U: file is unmerged (you must complete the merge before it can
be committed)
- X: "unknown" change type (most probably a bug, please report it)

View File

@ -11,7 +11,7 @@ linkgit:git-diff-files[1]
with the `-p` option produces patch text.
You can customize the creation of patch text via the
`GIT_EXTERNAL_DIFF` and the `GIT_DIFF_OPTS` environment variables
(see linkgit:git[1]).
(see linkgit:git[1]), and the `diff` attribute (see linkgit:gitattributes[5]).
What the -p option produces is slightly different from the traditional
diff format:
@ -74,6 +74,11 @@ separate lines indicate the old and the new mode.
rename from b
rename to a
5. Hunk headers mention the name of the function to which the hunk
applies. See "Defining a custom hunk-header" in
linkgit:gitattributes[5] for details of how to tailor to this to
specific languages.
Combined diff format
--------------------
@ -81,9 +86,9 @@ Combined diff format
Any diff-generating command can take the `-c` or `--cc` option to
produce a 'combined diff' when showing a merge. This is the default
format when showing merges with linkgit:git-diff[1] or
linkgit:git-show[1]. Note also that you can give the `-m` option to any
of these commands to force generation of diffs with individual parents
of a merge.
linkgit:git-show[1]. Note also that you can give suitable
`--diff-merges` option to any of these commands to force generation of
diffs in specific format.
A "combined diff" format looks like this:

View File

@ -33,6 +33,64 @@ endif::git-diff[]
show the patch by default, or to cancel the effect of `--patch`.
endif::git-format-patch[]
ifdef::git-log[]
--diff-merges=(off|none|on|first-parent|1|separate|m|combined|c|dense-combined|cc)::
--no-diff-merges::
Specify diff format to be used for merge commits. Default is
{diff-merges-default} unless `--first-parent` is in use, in which case
`first-parent` is the default.
+
--diff-merges=(off|none):::
--no-diff-merges:::
Disable output of diffs for merge commits. Useful to override
implied value.
+
--diff-merges=on:::
--diff-merges=m:::
-m:::
This option makes diff output for merge commits to be shown in
the default format. `-m` will produce the output only if `-p`
is given as well. The default format could be changed using
`log.diffMerges` configuration parameter, which default value
is `separate`.
+
--diff-merges=first-parent:::
--diff-merges=1:::
This option makes merge commits show the full diff with
respect to the first parent only.
+
--diff-merges=separate:::
This makes merge commits show the full diff with respect to
each of the parents. Separate log entry and diff is generated
for each parent.
+
--diff-merges=combined:::
--diff-merges=c:::
-c:::
With this option, diff output for a merge commit shows the
differences from each of the parents to the merge result
simultaneously instead of showing pairwise diff between a
parent and the result one at a time. Furthermore, it lists
only files which were modified from all parents. `-c` implies
`-p`.
+
--diff-merges=dense-combined:::
--diff-merges=cc:::
--cc:::
With this option the output produced by
`--diff-merges=combined` is further compressed by omitting
uninteresting hunks whose contents in the parents have only
two variants and the merge result picks one of them without
modification. `--cc` implies `-p`.
--combined-all-paths::
This flag causes combined diffs (used for merge commits) to
list the name of the file from all parents. It thus only has
effect when `--diff-merges=[dense-]combined` is in use, and
is likely only useful if filename changes are detected (i.e.
when either rename or copy detection have been requested).
endif::git-log[]
-U<n>::
--unified=<n>::
Generate diffs with <n> lines of context instead of
@ -242,11 +300,14 @@ explained for the configuration variable `core.quotePath` (see
linkgit:git-config[1]).
--name-only::
Show only names of changed files.
Show only names of changed files. The file names are often encoded in UTF-8.
For more information see the discussion about encoding in the linkgit:git-log[1]
manual page.
--name-status::
Show only names and status of changed files. See the description
of the `--diff-filter` option on what the status letters mean.
Just like `--name-only` the file names are often encoded in UTF-8.
--submodule[=<format>]::
Specify how differences in submodules are shown. When specifying
@ -527,11 +588,17 @@ When used together with `-B`, omit also the preimage in the deletion part
of a delete/create pair.
-l<num>::
The `-M` and `-C` options require O(n^2) processing time where n
is the number of potential rename/copy targets. This
option prevents rename/copy detection from running if
the number of rename/copy targets exceeds the specified
number.
The `-M` and `-C` options involve some preliminary steps that
can detect subsets of renames/copies cheaply, followed by an
exhaustive fallback portion that compares all remaining
unpaired destinations to all relevant sources. (For renames,
only remaining unpaired sources are relevant; for copies, all
original sources are relevant.) For N sources and
destinations, this exhaustive check is O(N^2). This option
prevents the exhaustive portion of rename/copy detection from
running if the number of source/destination files involved
exceeds the specified number. Defaults to diff.renameLimit.
Note that a value of 0 is treated as unlimited.
ifndef::git-format-patch[]
--diff-filter=[(A|C|D|M|R|T|U|X|B)...[*]]::
@ -649,6 +716,14 @@ matches a pattern if removing any number of the final pathname
components matches the pattern. For example, the pattern "`foo*bar`"
matches "`fooasdfbar`" and "`foo/bar/baz/asdf`" but not "`foobarx`".
--skip-to=<file>::
--rotate-to=<file>::
Discard the files before the named <file> from the output
(i.e. 'skip to'), or move them to the end of the output
(i.e. 'rotate to'). These were invented primarily for use
of the `git difftool` command, and may not be very useful
otherwise.
ifndef::git-format-patch[]
-R::
Swap two inputs; that is, show differences from index or

View File

@ -7,6 +7,10 @@
existing contents of `.git/FETCH_HEAD`. Without this
option old data in `.git/FETCH_HEAD` will be overwritten.
--atomic::
Use an atomic transaction to update local refs. Either all refs are
updated, or on error, no refs are updated.
--depth=<depth>::
Limit fetching to the specified number of commits from the tip of
each remote branch history. If fetching to a 'shallow' repository
@ -58,8 +62,17 @@ The argument to this option may be a glob on ref names, a ref, or the (possibly
abbreviated) SHA-1 of a commit. Specifying a glob is equivalent to specifying
this option multiple times, one for each matching ref name.
+
See also the `fetch.negotiationAlgorithm` configuration variable
documented in linkgit:git-config[1].
See also the `fetch.negotiationAlgorithm` and `push.negotiate`
configuration variables documented in linkgit:git-config[1], and the
`--negotiate-only` option below.
--negotiate-only::
Do not fetch anything from the server, and instead print the
ancestors of the provided `--negotiation-tip=*` arguments,
which we have in common with the server.
+
Internally this is used to implement the `push.negotiate` option, see
linkgit:git-config[1].
--dry-run::
Show what would be done, without making any changes.
@ -106,6 +119,11 @@ ifndef::git-pull[]
setting `fetch.writeCommitGraph`.
endif::git-pull[]
--prefetch::
Modify the configured refspec to place all refs into the
`refs/prefetch/` namespace. See the `prefetch` task in
linkgit:git-maintenance[1].
-p::
--prune::
Before fetching, remove any remote-tracking references that no

View File

@ -9,7 +9,7 @@ SYNOPSIS
--------
[verse]
'git add' [--verbose | -v] [--dry-run | -n] [--force | -f] [--interactive | -i] [--patch | -p]
[--edit | -e] [--[no-]all | --[no-]ignore-removal | [--update | -u]]
[--edit | -e] [--[no-]all | --[no-]ignore-removal | [--update | -u]] [--sparse]
[--intent-to-add | -N] [--refresh] [--ignore-errors] [--ignore-missing] [--renormalize]
[--chmod=(+|-)x] [--pathspec-from-file=<file> [--pathspec-file-nul]]
[--] [<pathspec>...]
@ -79,6 +79,13 @@ in linkgit:gitglossary[7].
--force::
Allow adding otherwise ignored files.
--sparse::
Allow updating index entries outside of the sparse-checkout cone.
Normally, `git add` refuses to update index entries whose paths do
not fit within the sparse-checkout cone, since those files might
be removed from the working tree without warning. See
linkgit:git-sparse-checkout[1] for more details.
-i::
--interactive::
Add modified contents in the working tree interactively to

View File

@ -15,6 +15,7 @@ SYNOPSIS
[--whitespace=<option>] [-C<n>] [-p<n>] [--directory=<dir>]
[--exclude=<path>] [--include=<path>] [--reject] [-q | --quiet]
[--[no-]scissors] [-S[<keyid>]] [--patch-format=<format>]
[--quoted-cr=<action>]
[(<mbox> | <Maildir>)...]
'git am' (--continue | --skip | --abort | --quit | --show-current-patch[=(diff|raw)])
@ -59,6 +60,9 @@ OPTIONS
--no-scissors::
Ignore scissors lines (see linkgit:git-mailinfo[1]).
--quoted-cr=<action>::
This flag will be passed down to 'git mailinfo' (see linkgit:git-mailinfo[1]).
-m::
--message-id::
Pass the `-m` flag to 'git mailinfo' (see linkgit:git-mailinfo[1]),
@ -79,7 +83,7 @@ OPTIONS
Pass `-u` flag to 'git mailinfo' (see linkgit:git-mailinfo[1]).
The proposed commit log message taken from the e-mail
is re-coded into UTF-8 encoding (configuration variable
`i18n.commitencoding` can be used to specify project's
`i18n.commitEncoding` can be used to specify project's
preferred encoding if it is not UTF-8).
+
This was optional in prior versions of git, but now it is the
@ -174,6 +178,8 @@ default. You can use `--no-utf8` to override this.
--abort::
Restore the original branch and abort the patching operation.
Revert contents of files involved in the am operation to their
pre-am state.
--quit::
Abort the patching operation but keep HEAD and the index

View File

@ -84,12 +84,13 @@ OPTIONS
-3::
--3way::
When the patch does not apply cleanly, fall back on 3-way merge if
the patch records the identity of blobs it is supposed to apply to,
and we have those blobs available locally, possibly leaving the
Attempt 3-way merge if the patch records the identity of blobs it is supposed
to apply to and we have those blobs available locally, possibly leaving the
conflict markers in the files in the working tree for the user to
resolve. This option implies the `--index` option, and is incompatible
with the `--reject` and the `--cached` options.
resolve. This option implies the `--index` option unless the
`--cached` option is used, and is incompatible with the `--reject` option.
When used with the `--cached` option, any conflicts are left at higher stages
in the cache.
--build-fake-ancestor=<file>::
Newer 'git diff' output has embedded 'index information'

View File

@ -93,12 +93,19 @@ BACKEND EXTRA OPTIONS
zip
~~~
-0::
Store the files instead of deflating them.
-9::
Highest and slowest compression level. You can specify any
number from 1 to 9 to adjust compression speed and ratio.
-<digit>::
Specify compression level. Larger values allow the command
to spend more time to compress to smaller size. Supported
values are from `-0` (store-only) to `-9` (best ratio).
Default is `-6` if not given.
tar
~~~
-<number>::
Specify compression level. The value will be passed to the
compression command configured in `tar.<format>.command`. See
manual page of the configured command for the list of supported
levels and the default level if this option isn't specified.
CONFIGURATION
-------------

View File

@ -11,8 +11,8 @@ SYNOPSIS
'git blame' [-c] [-b] [-l] [--root] [-t] [-f] [-n] [-s] [-e] [-p] [-w] [--incremental]
[-L <range>] [-S <revs-file>] [-M] [-C] [-C] [-C] [--since=<date>]
[--ignore-rev <rev>] [--ignore-revs-file <file>]
[--progress] [--abbrev=<n>] [<rev> | --contents <file> | --reverse <rev>..<rev>]
[--] <file>
[--color-lines] [--color-by-age] [--progress] [--abbrev=<n>]
[<rev> | --contents <file> | --reverse <rev>..<rev>] [--] <file>
DESCRIPTION
-----------
@ -93,6 +93,19 @@ include::blame-options.txt[]
is used for a caret to mark the boundary commit.
THE DEFAULT FORMAT
------------------
When neither `--porcelain` nor `--incremental` option is specified,
`git blame` will output annotation for each line with:
- abbreviated object name for the commit the line came from;
- author ident (by default author name and date, unless `-s` or `-e`
is specified); and
- line number
before the line contents.
THE PORCELAIN FORMAT
--------------------
@ -226,7 +239,7 @@ commit commentary), a blame viewer will not care.
MAPPING AUTHORS
---------------
include::mailmap.txt[]
See linkgit:gitmailmap[5].
SEE ALSO

View File

@ -78,8 +78,8 @@ renaming. If <newbranch> exists, -M must be used to force the rename
to happen.
The `-c` and `-C` options have the exact same semantics as `-m` and
`-M`, except instead of the branch being renamed it along with its
config and reflog will be copied to a new name.
`-M`, except instead of the branch being renamed, it will be copied to a
new name, along with its config and reflog.
With a `-d` or `-D` option, `<branchname>` will be deleted. You may
specify more than one branch for deletion. If the branch currently
@ -118,20 +118,21 @@ OPTIONS
Reset <branchname> to <startpoint>, even if <branchname> exists
already. Without `-f`, 'git branch' refuses to change an existing branch.
In combination with `-d` (or `--delete`), allow deleting the
branch irrespective of its merged status. In combination with
branch irrespective of its merged status, or whether it even
points to a valid commit. In combination with
`-m` (or `--move`), allow renaming the branch even if the new
branch name already exists, the same applies for `-c` (or `--copy`).
-m::
--move::
Move/rename a branch and the corresponding reflog.
Move/rename a branch, together with its config and reflog.
-M::
Shortcut for `--move --force`.
-c::
--copy::
Copy a branch and the corresponding reflog.
Copy a branch, together with its config and reflog.
-C::
Shortcut for `--copy --force`.
@ -153,7 +154,7 @@ OPTIONS
--column[=<options>]::
--no-column::
Display branch listing in columns. See configuration variable
column.branch for option syntax.`--column` and `--no-column`
`column.branch` for option syntax. `--column` and `--no-column`
without options are equivalent to 'always' and 'never' respectively.
+
This option is only applicable in non-verbose mode.

View File

@ -40,8 +40,8 @@ OPTIONS
-------
-o <path>::
--output-directory <path>::
Place the resulting bug report file in `<path>` instead of the root of
the Git repository.
Place the resulting bug report file in `<path>` instead of the current
directory.
-s <format>::
--suffix <format>::

View File

@ -13,26 +13,53 @@ SYNOPSIS
[--version=<version>] <file> <git-rev-list-args>
'git bundle' verify [-q | --quiet] <file>
'git bundle' list-heads <file> [<refname>...]
'git bundle' unbundle <file> [<refname>...]
'git bundle' unbundle [--progress] <file> [<refname>...]
DESCRIPTION
-----------
Some workflows require that one or more branches of development on one
machine be replicated on another machine, but the two machines cannot
be directly connected, and therefore the interactive Git protocols (git,
ssh, http) cannot be used.
Create, unpack, and manipulate "bundle" files. Bundles are used for
the "offline" transfer of Git objects without an active "server"
sitting on the other side of the network connection.
The 'git bundle' command packages objects and references in an archive
at the originating machine, which can then be imported into another
repository using 'git fetch', 'git pull', or 'git clone',
after moving the archive by some means (e.g., by sneakernet).
They can be used to create both incremental and full backups of a
repository, and to relay the state of the references in one repository
to another.
As no
direct connection between the repositories exists, the user must specify a
basis for the bundle that is held by the destination repository: the
bundle assumes that all objects in the basis are already in the
destination repository.
Git commands that fetch or otherwise "read" via protocols such as
`ssh://` and `https://` can also operate on bundle files. It is
possible linkgit:git-clone[1] a new repository from a bundle, to use
linkgit:git-fetch[1] to fetch from one, and to list the references
contained within it with linkgit:git-ls-remote[1]. There's no
corresponding "write" support, i.e.a 'git push' into a bundle is not
supported.
See the "EXAMPLES" section below for examples of how to use bundles.
BUNDLE FORMAT
-------------
Bundles are `.pack` files (see linkgit:git-pack-objects[1]) with a
header indicating what references are contained within the bundle.
Like the the packed archive format itself bundles can either be
self-contained, or be created using exclusions.
See the "OBJECT PREREQUISITES" section below.
Bundles created using revision exclusions are "thin packs" created
using the `--thin` option to linkgit:git-pack-objects[1], and
unbundled using the `--fix-thin` option to linkgit:git-index-pack[1].
There is no option to create a "thick pack" when using revision
exclusions, and users should not be concerned about the difference. By
using "thin packs", bundles created using exclusions are smaller in
size. That they're "thin" under the hood is merely noted here as a
curiosity, and as a reference to other documentation.
See link:technical/bundle-format.html[the `bundle-format`
documentation] for more details and the discussion of "thin pack" in
link:technical/pack-format.html[the pack format documentation] for
further details.
OPTIONS
-------
@ -117,28 +144,88 @@ unbundle <file>::
SPECIFYING REFERENCES
---------------------
'git bundle' will only package references that are shown by
'git show-ref': this includes heads, tags, and remote heads. References
such as `master~1` cannot be packaged, but are perfectly suitable for
defining the basis. More than one reference may be packaged, and more
than one basis can be specified. The objects packaged are those not
contained in the union of the given bases. Each basis can be
specified explicitly (e.g. `^master~10`), or implicitly (e.g.
`master~10..master`, `--since=10.days.ago master`).
Revisions must be accompanied by reference names to be packaged in a
bundle.
More than one reference may be packaged, and more than one set of prerequisite objects can
be specified. The objects packaged are those not contained in the
union of the prerequisites.
The 'git bundle create' command resolves the reference names for you
using the same rules as `git rev-parse --abbrev-ref=loose`. Each
prerequisite can be specified explicitly (e.g. `^master~10`), or implicitly
(e.g. `master~10..master`, `--since=10.days.ago master`).
All of these simple cases are OK (assuming we have a "master" and
"next" branch):
----------------
$ git bundle create master.bundle master
$ echo master | git bundle create master.bundle --stdin
$ git bundle create master-and-next.bundle master next
$ (echo master; echo next) | git bundle create master-and-next.bundle --stdin
----------------
And so are these (and the same but omitted `--stdin` examples):
----------------
$ git bundle create recent-master.bundle master~10..master
$ git bundle create recent-updates.bundle master~10..master next~5..next
----------------
A revision name or a range whose right-hand-side cannot be resolved to
a reference is not accepted:
----------------
$ git bundle create HEAD.bundle $(git rev-parse HEAD)
fatal: Refusing to create empty bundle.
$ git bundle create master-yesterday.bundle master~10..master~5
fatal: Refusing to create empty bundle.
----------------
OBJECT PREREQUISITES
--------------------
When creating bundles it is possible to create a self-contained bundle
that can be unbundled in a repository with no common history, as well
as providing negative revisions to exclude objects needed in the
earlier parts of the history.
Feeding a revision such as `new` to `git bundle create` will create a
bundle file that contains all the objects reachable from the revision
`new`. That bundle can be unbundled in any repository to obtain a full
history that leads to the revision `new`:
----------------
$ git bundle create full.bundle new
----------------
A revision range such as `old..new` will produce a bundle file that
will require the revision `old` (and any objects reachable from it)
to exist for the bundle to be "unbundle"-able:
----------------
$ git bundle create full.bundle old..new
----------------
A self-contained bundle without any prerequisites can be extracted
into anywhere, even into an empty repository, or be cloned from
(i.e., `new`, but not `old..new`).
It is very important that the basis used be held by the destination.
It is okay to err on the side of caution, causing the bundle file
to contain objects already in the destination, as these are ignored
when unpacking at the destination.
`git clone` can use any bundle created without negative refspecs
(e.g., `new`, but not `old..new`).
If you want to match `git clone --mirror`, which would include your
refs such as `refs/remotes/*`, use `--all`.
If you want to provide the same set of refs that a clone directly
from the source repository would get, use `--branches --tags` for
the `<git-rev-list-args>`.
The 'git bundle verify' command can be used to check whether your
recipient repository has the required prerequisite commits for a
bundle.
EXAMPLES
--------
@ -149,7 +236,7 @@ but we can move data from A to B via some mechanism (CD, email, etc.).
We want to update R2 with development made on the branch master in R1.
To bootstrap the process, you can first create a bundle that does not have
any basis. You can use a tag to remember up to what commit you last
any prerequisites. You can use a tag to remember up to what commit you last
processed, in order to make it easy to later update the other repository
with an incremental bundle:
@ -200,7 +287,7 @@ machineB$ git pull
If you know up to what commit the intended recipient repository should
have the necessary objects, you can use that knowledge to specify the
basis, giving a cut-off point to limit the revisions and objects that go
prerequisites, giving a cut-off point to limit the revisions and objects that go
in the resulting bundle. The previous example used the lastR2bundle tag
for this purpose, but you can use any other options that you would give to
the linkgit:git-log[1] command. Here are more examples:
@ -211,7 +298,7 @@ You can use a tag that is present in both:
$ git bundle create mybundle v1.0.0..master
----------------
You can use a basis based on time:
You can use a prerequisite based on time:
----------------
$ git bundle create mybundle --since=10.days master
@ -224,7 +311,7 @@ $ git bundle create mybundle -10 master
----------------
You can run `git-bundle verify` to see if you can extract from a bundle
that was created with a basis:
that was created with a prerequisite:
----------------
$ git bundle verify mybundle

View File

@ -35,42 +35,42 @@ OPTIONS
-t::
Instead of the content, show the object type identified by
<object>.
`<object>`.
-s::
Instead of the content, show the object size identified by
<object>.
`<object>`.
-e::
Exit with zero status if <object> exists and is a valid
object. If <object> is of an invalid format exit with non-zero and
Exit with zero status if `<object>` exists and is a valid
object. If `<object>` is of an invalid format exit with non-zero and
emits an error on stderr.
-p::
Pretty-print the contents of <object> based on its type.
Pretty-print the contents of `<object>` based on its type.
<type>::
Typically this matches the real type of <object> but asking
Typically this matches the real type of `<object>` but asking
for a type that can trivially be dereferenced from the given
<object> is also permitted. An example is to ask for a
"tree" with <object> being a commit object that contains it,
or to ask for a "blob" with <object> being a tag object that
`<object>` is also permitted. An example is to ask for a
"tree" with `<object>` being a commit object that contains it,
or to ask for a "blob" with `<object>` being a tag object that
points at it.
--textconv::
Show the content as transformed by a textconv filter. In this case,
<object> has to be of the form <tree-ish>:<path>, or :<path> in
`<object>` has to be of the form `<tree-ish>:<path>`, or `:<path>` in
order to apply the filter to the content recorded in the index at
<path>.
`<path>`.
--filters::
Show the content as converted by the filters configured in
the current working tree for the given <path> (i.e. smudge filters,
end-of-line conversion, etc). In this case, <object> has to be of
the form <tree-ish>:<path>, or :<path>.
the current working tree for the given `<path>` (i.e. smudge filters,
end-of-line conversion, etc). In this case, `<object>` has to be of
the form `<tree-ish>:<path>`, or `:<path>`.
--path=<path>::
For use with --textconv or --filters, to allow specifying an object
For use with `--textconv` or `--filters`, to allow specifying an object
name and a path separately, e.g. when it is difficult to figure out
the revision from which the blob came.
@ -94,8 +94,10 @@ OPTIONS
Instead of reading a list of objects on stdin, perform the
requested batch operation on all objects in the repository and
any alternate object stores (not just reachable objects).
Requires `--batch` or `--batch-check` be specified. Note that
the objects are visited in order sorted by their hashes.
Requires `--batch` or `--batch-check` be specified. By default,
the objects are visited in order sorted by their hashes; see
also `--unordered` below. Objects are presented as-is, without
respecting the "replace" mechanism of linkgit:git-replace[1].
--buffer::
Normally batch output is flushed after each object is output, so
@ -115,15 +117,15 @@ OPTIONS
repository.
--allow-unknown-type::
Allow -s or -t to query broken/corrupt objects of unknown type.
Allow `-s` or `-t` to query broken/corrupt objects of unknown type.
--follow-symlinks::
With --batch or --batch-check, follow symlinks inside the
With `--batch` or `--batch-check`, follow symlinks inside the
repository when requesting objects with extended SHA-1
expressions of the form tree-ish:path-in-tree. Instead of
providing output about the link itself, provide output about
the linked-to object. If a symlink points outside the
tree-ish (e.g. a link to /foo or a root-level link to ../foo),
tree-ish (e.g. a link to `/foo` or a root-level link to `../foo`),
the portion of the link which is outside the tree will be
printed.
+
@ -175,15 +177,15 @@ respectively print:
OUTPUT
------
If `-t` is specified, one of the <type>.
If `-t` is specified, one of the `<type>`.
If `-s` is specified, the size of the <object> in bytes.
If `-s` is specified, the size of the `<object>` in bytes.
If `-e` is specified, no output, unless the <object> is malformed.
If `-e` is specified, no output, unless the `<object>` is malformed.
If `-p` is specified, the contents of <object> are pretty-printed.
If `-p` is specified, the contents of `<object>` are pretty-printed.
If <type> is specified, the raw (though uncompressed) contents of the <object>
If `<type>` is specified, the raw (though uncompressed) contents of the `<object>`
will be returned.
BATCH OUTPUT
@ -200,7 +202,7 @@ object, with placeholders of the form `%(atom)` expanded, followed by a
newline. The available atoms are:
`objectname`::
The 40-hex object name of the object.
The full hex representation of the object name.
`objecttype`::
The type of the object (the same as `cat-file -t` reports).
@ -215,8 +217,9 @@ newline. The available atoms are:
`deltabase`::
If the object is stored as a delta on-disk, this expands to the
40-hex sha1 of the delta base object. Otherwise, expands to the
null sha1 (40 zeroes). See `CAVEATS` below.
full hex representation of the delta base object name.
Otherwise, expands to the null OID (all zeroes). See `CAVEATS`
below.
`rest`::
If this atom is used in the output string, input lines are split
@ -235,14 +238,14 @@ newline.
For example, `--batch` without a custom format would produce:
------------
<sha1> SP <type> SP <size> LF
<oid> SP <type> SP <size> LF
<contents> LF
------------
Whereas `--batch-check='%(objectname) %(objecttype)'` would produce:
------------
<sha1> SP <type> LF
<oid> SP <type> LF
------------
If a name is specified on stdin that cannot be resolved to an object in
@ -258,7 +261,7 @@ If a name is specified that might refer to more than one object (an ambiguous sh
<object> SP ambiguous LF
------------
If --follow-symlinks is used, and a symlink in the repository points
If `--follow-symlinks` is used, and a symlink in the repository points
outside the repository, then `cat-file` will ignore any custom format
and print:
@ -267,11 +270,11 @@ symlink SP <size> LF
<symlink> LF
------------
The symlink will either be absolute (beginning with a /), or relative
to the tree root. For instance, if dir/link points to ../../foo, then
<symlink> will be ../foo. <size> is the size of the symlink in bytes.
The symlink will either be absolute (beginning with a `/`), or relative
to the tree root. For instance, if dir/link points to `../../foo`, then
`<symlink>` will be `../foo`. `<size>` is the size of the symlink in bytes.
If --follow-symlinks is used, the following error messages will be
If `--follow-symlinks` is used, the following error messages will be
displayed:
------------

View File

@ -36,10 +36,17 @@ name is provided or known to the 'mailmap', ``Name $$<user@host>$$'' is
printed; otherwise only ``$$<user@host>$$'' is printed.
CONFIGURATION
-------------
See `mailmap.file` and `mailmap.blob` in linkgit:git-config[1] for how
to specify a custom `.mailmap` target file or object.
MAPPING AUTHORS
---------------
include::mailmap.txt[]
See linkgit:gitmailmap[5].
GIT

View File

@ -118,8 +118,9 @@ OPTIONS
-f::
--force::
When switching branches, proceed even if the index or the
working tree differs from `HEAD`. This is used to throw away
local changes.
working tree differs from `HEAD`, and even if there are untracked
files in the way. This is used to throw away local changes and
any untracked files or directories that are in the way.
+
When checking out paths from the index, do not fail upon unmerged
entries; instead, unmerged entries are ignored.

View File

@ -15,7 +15,7 @@ SYNOPSIS
[--dissociate] [--separate-git-dir <git dir>]
[--depth <depth>] [--[no-]single-branch] [--no-tags]
[--recurse-submodules[=<pathspec>]] [--[no-]shallow-submodules]
[--[no-]remote-submodules] [--jobs <n>] [--sparse]
[--[no-]remote-submodules] [--jobs <n>] [--sparse] [--[no-]reject-shallow]
[--filter=<filter>] [--] <repository>
[<directory>]
@ -149,6 +149,11 @@ objects from the source repository into a pack in the cloned repository.
--no-checkout::
No checkout of HEAD is performed after the clone is complete.
--[no-]reject-shallow::
Fail if the source repository is a shallow repository.
The 'clone.rejectShallow' configuration variable can be used to
specify the default.
--bare::
Make a 'bare' Git repository. That is, instead of
creating `<directory>` and placing the administrative

View File

@ -39,7 +39,7 @@ OPTIONS
--indent=<string>::
String to be printed at the beginning of each line.
--nl=<N>::
--nl=<string>::
String to be printed at the end of each line,
including newline character.

View File

@ -9,12 +9,13 @@ SYNOPSIS
--------
[verse]
'git commit' [-a | --interactive | --patch] [-s] [-v] [-u<mode>] [--amend]
[--dry-run] [(-c | -C | --fixup | --squash) <commit>]
[--dry-run] [(-c | -C | --squash) <commit> | --fixup [(amend|reword):]<commit>)]
[-F <file> | -m <msg>] [--reset-author] [--allow-empty]
[--allow-empty-message] [--no-verify] [-e] [--author=<author>]
[--date=<date>] [--cleanup=<mode>] [--[no-]status]
[-i | -o] [--pathspec-from-file=<file> [--pathspec-file-nul]]
[-S[<keyid>]] [--] [<pathspec>...]
[(--trailer <token>[(=|:)<value>])...] [-S[<keyid>]]
[--] [<pathspec>...]
DESCRIPTION
-----------
@ -71,7 +72,7 @@ OPTIONS
-p::
--patch::
Use the interactive patch selection interface to chose
Use the interactive patch selection interface to choose
which changes to commit. See linkgit:git-add[1] for
details.
@ -86,11 +87,44 @@ OPTIONS
Like '-C', but with `-c` the editor is invoked, so that
the user can further edit the commit message.
--fixup=<commit>::
Construct a commit message for use with `rebase --autosquash`.
The commit message will be the subject line from the specified
commit with a prefix of "fixup! ". See linkgit:git-rebase[1]
for details.
--fixup=[(amend|reword):]<commit>::
Create a new commit which "fixes up" `<commit>` when applied with
`git rebase --autosquash`. Plain `--fixup=<commit>` creates a
"fixup!" commit which changes the content of `<commit>` but leaves
its log message untouched. `--fixup=amend:<commit>` is similar but
creates an "amend!" commit which also replaces the log message of
`<commit>` with the log message of the "amend!" commit.
`--fixup=reword:<commit>` creates an "amend!" commit which
replaces the log message of `<commit>` with its own log message
but makes no changes to the content of `<commit>`.
+
The commit created by plain `--fixup=<commit>` has a subject
composed of "fixup!" followed by the subject line from <commit>,
and is recognized specially by `git rebase --autosquash`. The `-m`
option may be used to supplement the log message of the created
commit, but the additional commentary will be thrown away once the
"fixup!" commit is squashed into `<commit>` by
`git rebase --autosquash`.
+
The commit created by `--fixup=amend:<commit>` is similar but its
subject is instead prefixed with "amend!". The log message of
<commit> is copied into the log message of the "amend!" commit and
opened in an editor so it can be refined. When `git rebase
--autosquash` squashes the "amend!" commit into `<commit>`, the
log message of `<commit>` is replaced by the refined log message
from the "amend!" commit. It is an error for the "amend!" commit's
log message to be empty unless `--allow-empty-message` is
specified.
+
`--fixup=reword:<commit>` is shorthand for `--fixup=amend:<commit>
--only`. It creates an "amend!" commit with only a log message
(ignoring any changes staged in the index). When squashed by `git
rebase --autosquash`, it replaces the log message of `<commit>`
without making any other changes.
+
Neither "fixup!" nor "amend!" commits change authorship of
`<commit>` when applied by `git rebase --autosquash`.
See linkgit:git-rebase[1] for details.
--squash=<commit>::
Construct a commit message for use with `rebase --autosquash`.
@ -166,9 +200,21 @@ The `-m` option is mutually exclusive with `-c`, `-C`, and `-F`.
include::signoff-option.txt[]
--trailer <token>[(=|:)<value>]::
Specify a (<token>, <value>) pair that should be applied as a
trailer. (e.g. `git commit --trailer "Signed-off-by:C O Mitter \
<committer@example.com>" --trailer "Helped-by:C O Mitter \
<committer@example.com>"` will add the "Signed-off-by" trailer
and the "Helped-by" trailer to the commit message.)
The `trailer.*` configuration variables
(linkgit:git-interpret-trailers[1]) can be used to define if
a duplicated trailer is omitted, where in the run of trailers
each trailer would appear, and other details.
-n::
--no-verify::
This option bypasses the pre-commit and commit-msg hooks.
--[no-]verify::
By default, the pre-commit and commit-msg hooks are run.
When any of `--no-verify` or `-n` is given, these are bypassed.
See also linkgit:githooks[5].
--allow-empty::

View File

@ -71,6 +71,10 @@ codes are:
On success, the command returns the exit code 0.
A list of all available configuration variables can be obtained using the
`git help --config` command.
[[OPTIONS]]
OPTIONS
-------
@ -143,7 +147,13 @@ See also <<FILES>>.
-f config-file::
--file config-file::
Use the given config file instead of the one specified by GIT_CONFIG.
For writing options: write to the specified file rather than the
repository `.git/config`.
+
For reading options: read only from the specified file rather than from all
available files.
+
See also <<FILES>>.
--blob blob::
Similar to `--file` but use the given blob instead of a file. E.g.
@ -325,20 +335,18 @@ All writing options will per default write to the repository specific
configuration file. Note that this also affects options like `--replace-all`
and `--unset`. *'git config' will only ever change one file at a time*.
You can override these rules either by command-line options or by environment
variables. The `--global`, `--system` and `--worktree` options will limit
the file used to the global, system-wide or per-worktree file respectively.
The `GIT_CONFIG` environment variable has a similar effect, but you
can specify any filename you want.
You can override these rules using the `--global`, `--system`,
`--local`, `--worktree`, and `--file` command-line options; see
<<OPTIONS>> above.
ENVIRONMENT
-----------
GIT_CONFIG::
Take the configuration from the given file instead of .git/config.
Using the "--global" option forces this to ~/.gitconfig. Using the
"--system" option forces this to $(prefix)/etc/gitconfig.
GIT_CONFIG_GLOBAL::
GIT_CONFIG_SYSTEM::
Take the configuration from the given files instead from global or
system-level configuration. See linkgit:git[1] for details.
GIT_CONFIG_NOSYSTEM::
Whether to skip reading settings from the system-wide
@ -346,6 +354,28 @@ GIT_CONFIG_NOSYSTEM::
See also <<FILES>>.
GIT_CONFIG_COUNT::
GIT_CONFIG_KEY_<n>::
GIT_CONFIG_VALUE_<n>::
If GIT_CONFIG_COUNT is set to a positive number, all environment pairs
GIT_CONFIG_KEY_<n> and GIT_CONFIG_VALUE_<n> up to that number will be
added to the process's runtime configuration. The config pairs are
zero-indexed. Any missing key or value is treated as an error. An empty
GIT_CONFIG_COUNT is treated the same as GIT_CONFIG_COUNT=0, namely no
pairs are processed. These environment variables will override values
in configuration files, but will be overridden by any explicit options
passed via `git -c`.
+
This is useful for cases where you want to spawn multiple git commands
with a common configuration but cannot depend on a configuration file,
for example when writing scripts.
GIT_CONFIG::
If no `--file` option is provided to `git config`, use the file
given by `GIT_CONFIG` as if it were provided via `--file`. This
variable has no effect on other Git commands, and is mostly for
historical compatibility; there is generally no reason to use it
instead of the `--file` option.
[[EXAMPLES]]
EXAMPLES

View File

@ -159,3 +159,7 @@ empty string.
+
Components which are missing from the URL (e.g., there is no
username in the example above) will be left unset.
GIT
---
Part of the linkgit:git[1] suite

View File

@ -24,6 +24,18 @@ Usage:
[verse]
'git-cvsserver' [<options>] [pserver|server] [<directory> ...]
DESCRIPTION
-----------
This application is a CVS emulation layer for Git.
It is highly functional. However, not all methods are implemented,
and for those methods that are implemented,
not all switches are implemented.
Testing has been done using both the CLI CVS client, and the Eclipse CVS
plugin. Most functionality works fine with both of these clients.
OPTIONS
-------
@ -57,18 +69,6 @@ access still needs to be enabled by the `gitcvs.enabled` config option
unless `--export-all` was given, too.
DESCRIPTION
-----------
This application is a CVS emulation layer for Git.
It is highly functional. However, not all methods are implemented,
and for those methods that are implemented,
not all switches are implemented.
Testing has been done using both the CLI CVS client, and the Eclipse CVS
plugin. Most functionality works fine with both of these clients.
LIMITATIONS
-----------
@ -99,7 +99,7 @@ looks like
------
Only anonymous access is provided by pserve by default. To commit you
Only anonymous access is provided by pserver by default. To commit you
will have to create pserver accounts, simply add a gitcvs.authdb
setting in the config file of the repositories you want the cvsserver
to allow writes to, for example:
@ -114,21 +114,20 @@ The format of these files is username followed by the encrypted password,
for example:
------
myuser:$1Oyx5r9mdGZ2
myuser:$1$BA)@$vbnMJMDym7tA32AamXrm./
myuser:sqkNi8zPf01HI
myuser:$1$9K7FzU28$VfF6EoPYCJEYcVQwATgOP/
myuser:$5$.NqmNH1vwfzGpV8B$znZIcumu1tNLATgV2l6e1/mY8RzhUDHMOaVOeL1cxV3
------
You can use the 'htpasswd' facility that comes with Apache to make these
files, but Apache's MD5 crypt method differs from the one used by most C
library's crypt() function, so don't use the -m option.
files, but only with the -d option (or -B if your system suports it).
Alternatively you can produce the password with perl's crypt() operator:
-----
perl -e 'my ($user, $pass) = @ARGV; printf "%s:%s\n", $user, crypt($user, $pass)' $USER password
-----
Preferably use the system specific utility that manages password hash
creation in your platform (e.g. mkpasswd in Linux, encrypt in OpenBSD or
pwhash in NetBSD) and paste it in the right location.
Then provide your password via the pserver method, for example:
------
cvs -d:pserver:someuser:somepassword <at> server/path/repo.git co <HEAD_name>
cvs -d:pserver:someuser:somepassword@server:/path/repo.git co <HEAD_name>
------
No special setup is needed for SSH access, other than having Git tools
in the PATH. If you have clients that do not accept the CVS_SERVER
@ -138,7 +137,7 @@ Note: Newer CVS versions (>= 1.12.11) also support specifying
CVS_SERVER directly in CVSROOT like
------
cvs -d ":ext;CVS_SERVER=git cvsserver:user@server/path/repo.git" co <HEAD_name>
cvs -d ":ext;CVS_SERVER=git cvsserver:user@server/path/repo.git" co <HEAD_name>
------
This has the advantage that it will be saved in your 'CVS/Root' files and
you don't need to worry about always setting the correct environment
@ -186,8 +185,8 @@ allowing access over SSH.
+
--
------
export CVSROOT=:ext:user@server:/var/git/project.git
export CVS_SERVER="git cvsserver"
export CVSROOT=:ext:user@server:/var/git/project.git
export CVS_SERVER="git cvsserver"
------
--
4. For SSH clients that will make commits, make sure their server-side
@ -203,7 +202,7 @@ allowing access over SSH.
`project-master` directory:
+
------
cvs co -d project-master master
cvs co -d project-master master
------
[[dbbackend]]

View File

@ -63,9 +63,10 @@ OPTIONS
Automatically implies --tags.
--abbrev=<n>::
Instead of using the default 7 hexadecimal digits as the
abbreviated object name, use <n> digits, or as many digits
as needed to form a unique object name. An <n> of 0
Instead of using the default number of hexadecimal digits (which
will vary according to the number of objects in the repository with
a default of 7) of the abbreviated object name, use <n> digits, or
as many digits as needed to form a unique object name. An <n> of 0
will suppress long format, only showing the closest tag.
--candidates=<n>::
@ -139,8 +140,11 @@ at the end.
The number of additional commits is the number
of commits which would be displayed by "git log v1.0.4..parent".
The hash suffix is "-g" + unambiguous abbreviation for the tip commit
of parent (which was `2414721b194453f058079d897d13c4e377f92dc6`).
The hash suffix is "-g" + an unambigous abbreviation for the tip commit
of parent (which was `2414721b194453f058079d897d13c4e377f92dc6`). The
length of the abbreviation scales as the repository grows, using the
approximate number of objects in the repository and a bit of math
around the birthday paradox, and defaults to a minimum of 7.
The "g" prefix stands for "git" and is used to allow describing the version of
a software depending on the SCM the software is managed with. This is useful
in an environment where people may use different SCMs.

View File

@ -51,16 +51,20 @@ files on disk.
--staged is a synonym of --cached.
+
If --merge-base is given, instead of using <commit>, use the merge base
of <commit> and HEAD. `git diff --merge-base A` is equivalent to
`git diff $(git merge-base A HEAD)`.
of <commit> and HEAD. `git diff --cached --merge-base A` is equivalent to
`git diff --cached $(git merge-base A HEAD)`.
'git diff' [<options>] <commit> [--] [<path>...]::
'git diff' [<options>] [--merge-base] <commit> [--] [<path>...]::
This form is to view the changes you have in your
working tree relative to the named <commit>. You can
use HEAD to compare it with the latest commit, or a
branch name to compare with the tip of a different
branch.
+
If --merge-base is given, instead of using <commit>, use the merge base
of <commit> and HEAD. `git diff --merge-base A` is equivalent to
`git diff $(git merge-base A HEAD)`.
'git diff' [<options>] [--merge-base] <commit> <commit> [--] [<path>...]::

View File

@ -34,6 +34,14 @@ OPTIONS
This is the default behaviour; the option is provided to
override any configuration settings.
--rotate-to=<file>::
Start showing the diff for the given path,
the paths before it will move to end and output.
--skip-to=<file>::
Start showing the diff for the given path, skipping all
the paths before it.
-t <tool>::
--tool=<tool>::
Use the diff tool specified by <tool>. Valid values include

View File

@ -133,7 +133,7 @@ remember to run that, set `fetch.prune` globally, or
linkgit:git-config[1].
Here's where things get tricky and more specific. The pruning feature
doesn't actually care about branches, instead it'll prune local <->
doesn't actually care about branches, instead it'll prune local <-->
remote-references as a function of the refspec of the remote (see
`<refspec>` and <<CRTB,CONFIGURED REMOTE-TRACKING BRANCHES>> above).

View File

@ -235,6 +235,15 @@ and `date` to extract the named component. For email fields (`authoremail`,
without angle brackets, and `:localpart` to get the part before the `@` symbol
out of the trimmed email.
The raw data in an object is `raw`.
raw:size::
The raw data size of the object.
Note that `--format=%(raw)` can not be used with `--python`, `--shell`, `--tcl`,
because such language may not support arbitrary binary data in their string
variable type.
The message in a commit or a tag object is `contents`, from which
`contents:<part>` can be used to extract various parts out of:
@ -260,11 +269,9 @@ contents:lines=N::
The first `N` lines of the message.
Additionally, the trailers as interpreted by linkgit:git-interpret-trailers[1]
are obtained as `trailers` (or by using the historical alias
`contents:trailers`). Non-trailer lines from the trailer block can be omitted
with `trailers:only`. Whitespace-continuations can be removed from trailers so
that each trailer appears on a line by itself with its full content with
`trailers:unfold`. Both can be used together as `trailers:unfold,only`.
are obtained as `trailers[:options]` (or by using the historical alias
`contents:trailers[:options]`). For valid [:option] values see `trailers`
section of linkgit:git-log[1].
For sorting purposes, fields with numeric values sort in numeric order
(`objectsize`, `authordate`, `committerdate`, `creatordate`, `taggerdate`).

View File

@ -36,11 +36,28 @@ SYNOPSIS
DESCRIPTION
-----------
Prepare each commit with its patch in
one file per commit, formatted to resemble UNIX mailbox format.
Prepare each non-merge commit with its "patch" in
one "message" per commit, formatted to resemble a UNIX mailbox.
The output of this command is convenient for e-mail submission or
for use with 'git am'.
A "message" generated by the command consists of three parts:
* A brief metadata header that begins with `From <commit>`
with a fixed `Mon Sep 17 00:00:00 2001` datestamp to help programs
like "file(1)" to recognize that the file is an output from this
command, fields that record the author identity, the author date,
and the title of the change (taken from the first paragraph of the
commit log message).
* The second and subsequent paragraphs of the commit log message.
* The "patch", which is the "diff -p --stat" output (see
linkgit:git-diff[1]) between the commit and its parent.
The log message and the patch is separated by a line with a
three-dash line.
There are two ways to specify which commits to operate on.
1. A single commit, <since>, specifies that the commits leading
@ -221,6 +238,11 @@ populated with placeholder text.
`--subject-prefix` option) has ` v<n>` appended to it. E.g.
`--reroll-count=4` may produce `v4-0001-add-makefile.patch`
file that has "Subject: [PATCH v4 1/20] Add makefile" in it.
`<n>` does not have to be an integer (e.g. "--reroll-count=4.4",
or "--reroll-count=4rev2" are allowed), but the downside of
using such a reroll-count is that the range-diff/interdiff
with the previous version does not state exactly which
version the new interation is compared against.
--to=<email>::
Add a `To:` header to the email headers. This is in addition
@ -667,10 +689,10 @@ You can also use `git format-patch --base=P -3 C` to generate patches
for A, B and C, and the identifiers for P, X, Y, Z are appended at the
end of the first message.
If set `--base=auto` in cmdline, it will track base commit automatically,
the base commit will be the merge base of tip commit of the remote-tracking
If set `--base=auto` in cmdline, it will automatically compute
the base commit as the merge base of tip commit of the remote-tracking
branch and revision-range specified in cmdline.
For a local branch, you need to track a remote branch by `git branch
For a local branch, you need to make it to track a remote branch by `git branch
--set-upstream-to` before using this option.
EXAMPLES
@ -718,6 +740,14 @@ use it only when you know the recipient uses Git to apply your patch.
$ git format-patch -3
------------
CAVEATS
-------
Note that `format-patch` will omit merge commits from the output, even
if they are part of the requested range. A simple "patch" does not
include enough information for the receiving end to reproduce the same
merge commit.
SEE ALSO
--------
linkgit:git-am[1], linkgit:git-send-email[1]

View File

@ -117,12 +117,14 @@ NOTES
'git gc' tries very hard not to delete objects that are referenced
anywhere in your repository. In particular, it will keep not only
objects referenced by your current set of branches and tags, but also
objects referenced by the index, remote-tracking branches, notes saved
by 'git notes' under refs/notes/, reflogs (which may reference commits
in branches that were later amended or rewound), and anything else in
the refs/* namespace. If you are expecting some objects to be deleted
and they aren't, check all of those locations and decide whether it
makes sense in your case to remove those references.
objects referenced by the index, remote-tracking branches, reflogs
(which may reference commits in branches that were later amended or
rewound), and anything else in the refs/* namespace. Note that a note
(of the kind created by 'git notes') attached to an object does not
contribute in keeping the object alive. If you are expecting some
objects to be deleted and they aren't, check all of those locations
and decide whether it makes sense in your case to remove those
references.
On the other hand, when 'git gc' runs concurrently with another process,
there is a risk of it deleting an object that the other process is using

View File

@ -38,38 +38,6 @@ are lists of one or more search expressions separated by newline
characters. An empty string as search expression matches all lines.
CONFIGURATION
-------------
grep.lineNumber::
If set to true, enable `-n` option by default.
grep.column::
If set to true, enable the `--column` option by default.
grep.patternType::
Set the default matching behavior. Using a value of 'basic', 'extended',
'fixed', or 'perl' will enable the `--basic-regexp`, `--extended-regexp`,
`--fixed-strings`, or `--perl-regexp` option accordingly, while the
value 'default' will return to the default matching behavior.
grep.extendedRegexp::
If set to true, enable `--extended-regexp` option by default. This
option is ignored when the `grep.patternType` option is set to a value
other than 'default'.
grep.threads::
Number of grep worker threads to use. If unset (or set to 0), Git will
use as many threads as the number of logical cores available.
grep.fullName::
If set to true, enable `--full-name` option by default.
grep.fallbackToNoIndex::
If set to true, fall back to git grep --no-index if git grep
is executed outside of a git repository. Defaults to false.
OPTIONS
-------
--cached::
@ -363,6 +331,38 @@ with multiple threads might perform slower than single threaded if `--textconv`
is given and there're too many text conversions. So if you experience low
performance in this case, it might be desirable to use `--threads=1`.
CONFIGURATION
-------------
grep.lineNumber::
If set to true, enable `-n` option by default.
grep.column::
If set to true, enable the `--column` option by default.
grep.patternType::
Set the default matching behavior. Using a value of 'basic', 'extended',
'fixed', or 'perl' will enable the `--basic-regexp`, `--extended-regexp`,
`--fixed-strings`, or `--perl-regexp` option accordingly, while the
value 'default' will return to the default matching behavior.
grep.extendedRegexp::
If set to true, enable `--extended-regexp` option by default. This
option is ignored when the `grep.patternType` option is set to a value
other than 'default'.
grep.threads::
Number of grep worker threads to use. If unset (or set to 0), Git will
use as many threads as the number of logical cores available.
grep.fullName::
If set to true, enable `--full-name` option by default.
grep.fallbackToNoIndex::
If set to true, fall back to git grep --no-index if git grep
is executed outside of a git repository. Defaults to false.
GIT
---
Part of the linkgit:git[1] suite

View File

@ -8,8 +8,10 @@ git-help - Display help information about Git
SYNOPSIS
--------
[verse]
'git help' [-a|--all [--[no-]verbose]] [-g|--guides]
[-i|--info|-m|--man|-w|--web] [COMMAND|GUIDE]
'git help' [-a|--all [--[no-]verbose]]
[[-i|--info] [-m|--man] [-w|--web]] [COMMAND|GUIDE]
'git help' [-g|--guides]
'git help' [-c|--config]
DESCRIPTION
-----------
@ -58,8 +60,7 @@ OPTIONS
-g::
--guides::
Prints a list of the Git concept guides on the standard output. This
option overrides any given command or guide name.
Prints a list of the Git concept guides on the standard output.
-i::
--info::

View File

@ -16,7 +16,9 @@ A simple CGI program to serve the contents of a Git repository to Git
clients accessing the repository over http:// and https:// protocols.
The program supports clients fetching using both the smart HTTP protocol
and the backwards-compatible dumb HTTP protocol, as well as clients
pushing using the smart HTTP protocol.
pushing using the smart HTTP protocol. It also supports Git's
more-efficient "v2" protocol if properly configured; see the
discussion of `GIT_PROTOCOL` in the ENVIRONMENT section below.
It verifies that the directory has the magic file
"git-daemon-export-ok", and it will refuse to export any Git directory
@ -77,6 +79,18 @@ Apache 2.x::
SetEnv GIT_PROJECT_ROOT /var/www/git
SetEnv GIT_HTTP_EXPORT_ALL
ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/
# This is not strictly necessary using Apache and a modern version of
# git-http-backend, as the webserver will pass along the header in the
# environment as HTTP_GIT_PROTOCOL, and http-backend will copy that into
# GIT_PROTOCOL. But you may need this line (or something similar if you
# are using a different webserver), or if you want to support older Git
# versions that did not do that copying.
#
# Having the webserver set up GIT_PROTOCOL is perfectly fine even with
# modern versions (and will take precedence over HTTP_GIT_PROTOCOL,
# which means it can be used to override the client's request).
SetEnvIf Git-Protocol ".*" GIT_PROTOCOL=$0
----------------------------------------------------------------
+
To enable anonymous read access but authenticated write access,
@ -264,6 +278,16 @@ a repository with an extremely large number of refs. The value can be
specified with a unit (e.g., `100M` for 100 megabytes). The default is
10 megabytes.
Clients may probe for optional protocol capabilities (like the v2
protocol) using the `Git-Protocol` HTTP header. In order to support
these, the contents of that header must appear in the `GIT_PROTOCOL`
environment variable. Most webservers will pass this header to the CGI
via the `HTTP_GIT_PROTOCOL` variable, and `git-http-backend` will
automatically copy that to `GIT_PROTOCOL`. However, some webservers may
be more selective about which headers they'll pass, in which case they
need to be configured explicitly (see the mention of `Git-Protocol` in
the Apache config from the earlier EXAMPLES section).
The backend process sets GIT_COMMITTER_NAME to '$REMOTE_USER' and
GIT_COMMITTER_EMAIL to '$\{REMOTE_USER}@http.$\{REMOTE_ADDR\}',
ensuring that any reflogs created by 'git-receive-pack' contain some

View File

@ -41,11 +41,17 @@ commit-id::
<commit-id>['\t'<filename-as-in--w>]
--packfile=<hash>::
Instead of a commit id on the command line (which is not expected in
For internal use only. Instead of a commit id on the command
line (which is not expected in
this case), 'git http-fetch' fetches the packfile directly at the given
URL and uses index-pack to generate corresponding .idx and .keep files.
The hash is used to determine the name of the temporary file and is
arbitrary. The output of index-pack is printed to stdout.
arbitrary. The output of index-pack is printed to stdout. Requires
--index-pack-args.
--index-pack-args=<args>::
For internal use only. The command to run on the contents of the
downloaded pack. Arguments are URL-encoded separated by spaces.
--recover::
Verify that everything reachable from target is fetched. Used after

View File

@ -9,17 +9,18 @@ git-index-pack - Build pack index file for an existing packed archive
SYNOPSIS
--------
[verse]
'git index-pack' [-v] [-o <index-file>] <pack-file>
'git index-pack' [-v] [-o <index-file>] [--[no-]rev-index] <pack-file>
'git index-pack' --stdin [--fix-thin] [--keep] [-v] [-o <index-file>]
[<pack-file>]
[--[no-]rev-index] [<pack-file>]
DESCRIPTION
-----------
Reads a packed archive (.pack) from the specified file, and
builds a pack index file (.idx) for it. The packed archive
together with the pack index can then be placed in the
objects/pack/ directory of a Git repository.
builds a pack index file (.idx) for it. Optionally writes a
reverse-index (.rev) for the specified pack. The packed
archive together with the pack index can then be placed in
the objects/pack/ directory of a Git repository.
OPTIONS
@ -35,6 +36,13 @@ OPTIONS
fails if the name of packed archive does not end
with .pack).
--[no-]rev-index::
When this flag is provided, generate a reverse index
(a `.rev` file) corresponding to the given pack. If
`--verify` is given, ensure that the existing
reverse index is correct. Takes precedence over
`pack.writeReverseIndex`.
--stdin::
When this flag is provided, the pack is read from stdin
instead and a copy is then written to <pack-file>. If
@ -74,11 +82,22 @@ OPTIONS
--strict::
Die, if the pack contains broken objects or links.
--progress-title::
For internal use only.
+
Set the title of the progress bar. The title is "Receiving objects" by
default and "Indexing objects" when `--stdin` is specified.
--check-self-contained-and-connected::
Die if the pack contains broken links. For internal use only.
--fsck-objects::
Die if the pack contains broken objects. For internal use only.
For internal use only.
+
Die if the pack contains broken objects. If the pack contains a tree
pointing to a .gitmodules blob that does not exist, prints the hash of
that blob (for the caller to check) after the hash that goes into the
name of the pack/idx file (see "Notes").
--threads=<n>::
Specifies the number of threads to spawn when resolving

View File

@ -232,25 +232,38 @@ trailer.<token>.ifmissing::
that option for trailers with the specified <token>.
trailer.<token>.command::
This option can be used to specify a shell command that will
be called to automatically add or modify a trailer with the
specified <token>.
This option behaves in the same way as 'trailer.<token>.cmd', except
that it doesn't pass anything as argument to the specified command.
Instead the first occurrence of substring $ARG is replaced by the
value that would be passed as argument.
+
When this option is specified, the behavior is as if a special
'<token>=<value>' argument were added at the beginning of the command
line, where <value> is taken to be the standard output of the
specified command with any leading and trailing whitespace trimmed
off.
The 'trailer.<token>.command' option has been deprecated in favor of
'trailer.<token>.cmd' due to the fact that $ARG in the user's command is
only replaced once and that the original way of replacing $ARG is not safe.
+
If the command contains the `$ARG` string, this string will be
replaced with the <value> part of an existing trailer with the same
<token>, if any, before the command is launched.
When both 'trailer.<token>.cmd' and 'trailer.<token>.command' are given
for the same <token>, 'trailer.<token>.cmd' is used and
'trailer.<token>.command' is ignored.
trailer.<token>.cmd::
This option can be used to specify a shell command that will be called:
once to automatically add a trailer with the specified <token>, and then
each time a '--trailer <token>=<value>' argument to modify the <value> of
the trailer that this option would produce.
+
If some '<token>=<value>' arguments are also passed on the command
line, when a 'trailer.<token>.command' is configured, the command will
also be executed for each of these arguments. And the <value> part of
these arguments, if any, will be used to replace the `$ARG` string in
the command.
When the specified command is first called to add a trailer
with the specified <token>, the behavior is as if a special
'--trailer <token>=<value>' argument was added at the beginning
of the "git interpret-trailers" command, where <value>
is taken to be the standard output of the command with any
leading and trailing whitespace trimmed off.
+
If some '--trailer <token>=<value>' arguments are also passed
on the command line, the command is called again once for each
of these arguments with the same <token>. And the <value> part
of these arguments, if any, will be passed to the command as its
first argument. This way the command can produce a <value> computed
from the <value> passed in the '--trailer <token>=<value>' argument.
EXAMPLES
--------
@ -333,6 +346,55 @@ subject
Fix #42
------------
* Configure a 'help' trailer with a cmd use a script `glog-find-author`
which search specified author identity from git log in git repository
and show how it works:
+
------------
$ cat ~/bin/glog-find-author
#!/bin/sh
test -n "$1" && git log --author="$1" --pretty="%an <%ae>" -1 || true
$ git config trailer.help.key "Helped-by: "
$ git config trailer.help.ifExists "addIfDifferentNeighbor"
$ git config trailer.help.cmd "~/bin/glog-find-author"
$ git interpret-trailers --trailer="help:Junio" --trailer="help:Couder" <<EOF
> subject
>
> message
>
> EOF
subject
message
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Christian Couder <christian.couder@gmail.com>
------------
* Configure a 'ref' trailer with a cmd use a script `glog-grep`
to grep last relevant commit from git log in the git repository
and show how it works:
+
------------
$ cat ~/bin/glog-grep
#!/bin/sh
test -n "$1" && git log --grep "$1" --pretty=reference -1 || true
$ git config trailer.ref.key "Reference-to: "
$ git config trailer.ref.ifExists "replace"
$ git config trailer.ref.cmd "~/bin/glog-grep"
$ git interpret-trailers --trailer="ref:Add copyright notices." <<EOF
> subject
>
> message
>
> EOF
subject
message
Reference-to: 8bc9a0c769 (Add copyright notices., 2005-04-07)
------------
* Configure a 'see' trailer with a command to show the subject of a
commit that is related, and show how it works:
+

View File

@ -39,7 +39,9 @@ OPTIONS
full ref name (including prefix) will be printed. If 'auto' is
specified, then if the output is going to a terminal, the ref names
are shown as if 'short' were given, otherwise no ref names are
shown. The default option is 'short'.
shown. The option `--decorate` is short-hand for `--decorate=short`.
Default to configuration value of `log.decorate` if configured,
otherwise, `auto`.
--decorate-refs=<pattern>::
--decorate-refs-exclude=<pattern>::
@ -107,47 +109,15 @@ DIFF FORMATTING
By default, `git log` does not generate any diff output. The options
below can be used to show the changes made by each commit.
Note that unless one of `-c`, `--cc`, or `-m` is given, merge commits
will never show a diff, even if a diff format like `--patch` is
selected, nor will they match search options like `-S`. The exception is
when `--first-parent` is in use, in which merges are treated like normal
single-parent commits (this can be overridden by providing a
combined-diff option or with `--no-diff-merges`).
-c::
With this option, diff output for a merge commit
shows the differences from each of the parents to the merge result
simultaneously instead of showing pairwise diff between a parent
and the result one at a time. Furthermore, it lists only files
which were modified from all parents.
--cc::
This flag implies the `-c` option and further compresses the
patch output by omitting uninteresting hunks whose contents in
the parents have only two variants and the merge result picks
one of them without modification.
--combined-all-paths::
This flag causes combined diffs (used for merge commits) to
list the name of the file from all parents. It thus only has
effect when -c or --cc are specified, and is likely only
useful if filename changes are detected (i.e. when either
rename or copy detection have been requested).
-m::
This flag makes the merge commits show the full diff like
regular commits; for each merge parent, a separate log entry
and diff is generated. An exception is that only diff against
the first parent is shown when `--first-parent` option is given;
in that case, the output represents the changes the merge
brought _into_ the then-current branch.
--diff-merges=off::
--no-diff-merges::
Disable output of diffs for merge commits (default). Useful to
override `-m`, `-c`, or `--cc`.
Note that unless one of `--diff-merges` variants (including short
`-m`, `-c`, and `--cc` options) is explicitly given, merge commits
will not show a diff, even if a diff format like `--patch` is
selected, nor will they match search options like `-S`. The exception
is when `--first-parent` is in use, in which case `first-parent` is
the default format.
:git-log: 1
:diff-merges-default: `off`
include::diff-options.txt[]
include::diff-generate-patch.txt[]

Some files were not shown because too many files have changed in this diff Show More